Abstract - Folksonomy mining is attracting the interest of the Web 2.0
community since folksonomies represent the core data of social
resource sharing systems. However, a scrutiny of the related work on
mining folksonomies unveils that the time stamp dimension has not been
considered. For example, the large number of works dedicated to mining
tri-concepts from folksonomies did not take the time dimension into
account. In this paper, we consider a folksonomy commonly composed of
triples <users, tags, resources> and we consider time as a new
dimension. We motivate our approach by highlighting its potential
applications. Then, we present the foundations for mining
quadri-concepts, provide a formal definition of the problem and
introduce a new efficient algorithm, called QuadriCons, for its
solution, which allows mining folksonomies in time, i.e.,
d-folksonomies. We also introduce a new closure operator that splits
the induced search space into equivalence classes whose smallest
elements are the quadri-minimal generators. Experiments carried out on
large-scale real-world datasets highlight the good performance of our
algorithm.

Keywords - Quadratic Context; Formal Concept Analysis;
Quadratic Concepts; Folksonomies; Algorithm; Social Networks

I. Introduction 

Folksonomy (from folk and taxonomy) is a neologism for a practice of
collaborative categorization using freely chosen keywords [1].
Folksonomies (also called social tagging mechanisms) have been
implemented in a number of online knowledge sharing environments since
the idea was first adopted by the social bookmarking site DEL.ICIO.US
in 2004. The idea of a folksonomy is to allow the users to describe a
set of shared objects with a set of keywords, i.e., tags, of their own
choice. The new data of folksonomy systems provides a rich resource
for data analysis, information retrieval, and knowledge discovery
applications. The rise of folksonomies, due to the success of the
social resource sharing systems (e.g., Flickr, Bibsonomy, Youtube,
etc.), also called Web 2.0, has attracted the interest of researchers
to the folksonomy mining area. However, due to the huge size of
folksonomies, many works focus on the extraction of lossless concise
representations of interesting patterns, i.e., triadic concepts
[2] [3] [?].

Recently, in [5], the new TriCons algorithm was shown to outperform
its competitors thanks to a clever sweep of the search space.
Nevertheless, a scrutiny of this related work unveils that the time
stamp dimension has not been considered yet. Time is considered one of
the most important factors in detecting emerging subjects. Agrawal and
Srikant show in [6] the importance of sequential patterns, which may be
useful to discover rules integrating the notion of temporality and
sequence of events. In our case, such rules shall be of the form:
users who shared the movie "Alcatraz" using the tag prison will share
it later with the tag escape.

With this paper, we initiate the confluence of three lines of
research: Formal Concept Analysis, Folksonomy mining and Mining
Sequential Patterns. Formal Concept Analysis (FCA) [7] was extended
some fifteen years ago to deal with three-dimensional data [8].
However, Triadic Concept Analysis (TCA) did not garner much attention
from researchers until the coming of folksonomies, as they represent
the core data structure of social networks. Thus, we give a formal
definition of the problem of mining all frequent quadri-concepts (the
four-dimensional and sequential version of mining all frequent
tri-concepts) and introduce our algorithm QuadriCons for its solution,
which is an extension of the TriCons algorithm to the quadratic case.
We also introduce a new closure operator that splits the induced
search space into equivalence classes whose smallest elements are the
quadri-minimal generators (QGs); QGs are helpful for a clever sweep of
the search space [5] [?].

The remainder of the paper is organized as follows. In the next
section, we motivate our conceptual and temporal clustering approach
for solving the problem of mining all frequent quadri-concepts of a
given dataset. We thoroughly study the related work in Section III. In
Section IV, we provide a formal definition of the problem of mining all
frequent quadri-concepts. We introduce a new closure operator for the
quadratic context as well as the QuadriCons algorithm dedicated to the
extraction of all frequent quadri-concepts in Section V. Section VI
reports experiments on the performance of our algorithm in terms of
execution time, memory consumption and compactness of the
quadri-concepts. Finally, we conclude the paper with a summary and
sketch some avenues for future work in Section VII.



II. Motivation : Conceptual and Temporal 
Clustering of Folksonomies 

The immediate success of social networks, i.e., social resource
sharing systems, is due to the fact that no specific skills are needed
for participating [2]. Each individual user is able to share a web
page^1, a personal photo^2, an artist he likes^3 or a movie he
watched^4 without much effort.

^1 http://del.icio.us
^2 http://flickr.com
^3 http://last.fm
^4 http://movielens.org



The core data structure of such systems is a folksonomy. It consists
of three sets $\mathcal{U}$, $\mathcal{T}$, $\mathcal{R}$ of users
assigning tags to resources as well as a ternary relation Y between
them. To allow conceptual and temporal clustering from folksonomies,
an additional dimension, i.e., $\mathcal{D}$, is needed: time. Indeed,
the special feature of the folksonomies under study is their unceasing
evolution [10]. Such systems follow trends and evolve according to the
new users' taggings [11]. The increasing use of these systems shows
that folksonomy-based works are able to offer a better solution in the
domain of Web Information Retrieval (WIR) [12] by considering time
when dealing with a query or during the user's taggings, i.e., by
suggesting the appropriate trendy tags. Thus, a user who tagged a film
or a website with a given tag at a specific date may assign a whole
new tag at a different period under completely different
circumstances. For example, a user who associated the website
whitehouse.gov with the tags Bush and Iraq in 2004 might assign it the
tags Obama and crisis nowadays. A more real and sadly true example is
that users today associate Islam with the tag terrorism instead of
Quran; besides, one may see the incessant evolution of the tag
Binladen in social networks since September 2001 [13].

With the newly introduced dimension, i.e., time, our goal is to detect
hidden sequential conceptualizations in folksonomies. An example of
such a concept is that users who tagged "Harry Potter" will tag "The
Prisoner of Azkaban" and then tag "The Order of the Phoenix", probably
with the same tags.

Our algorithm solves the problem of frequent closed pattern mining for
this kind of data. It returns a set of (frequent) quadruples, where
each quadruple (U, T, R, D) consists of a set U of users, a set T of
tags, a set R of resources and a set D of dates. These quadruples,
called (frequent) quadri-concepts, have the property that each user in
U has tagged each resource in R with all tags from T at the different
dates from D, and that none of these sets can be extended without
shrinking one of the other three dimensions. Hence, they represent the
four-dimensional and sequential extension of tri-concepts. Moreover,
we can add minimum support constraints on each of the four dimensions
in order to focus on the largest concepts of the folksonomy, i.e., by
setting higher values of minimum supports.

In the remainder, we will scrutinize the state-of-the-art 
propositions aiming to deal with the folksonomy mining area. 

III. Related Work 

In this section, we discuss the different works that deal with
folksonomy mining. Due to their triadic form, many researchers
[?] [?] [?] focus on folksonomies in order to extract triadic
concepts, which are maximal sets of users, tags and resources.
Tri-concepts are the first step to a variety of applications: ontology
building [1], association rule derivation [14] and recommendation
systems [15], to cite but a few. Other papers focus on analysing the
structure of folksonomies [16] or on structuring the tripartite
network of folksonomies [10]. Recent works analyse the folksonomy's
evolution through time in order to discover the emergent subjects and
follow trends [13] [17] [18].

Since we are going to mine quadri-concepts from d-folksonomies, which
mimic the structure of quadratic contexts, we look for works that deal
with four-dimensional data. In [19], inspired by the work of Wille [8]
extending Formal Concept Analysis to three dimensions, Voutsadakis
created a framework for analyzing n-dimensional formal concepts. He
generalized triadic concept analysis to n dimensions for arbitrary n,
giving rise to Polyadic Concept Analysis. The n-adic contexts give
rise, in a way analogous to the triadic case, to n-adic formal
concepts. In [19], the author gives examples of quadratic concepts and
their associated quadri-lattice. Despite this robust theoretical study,
no algorithm has been proposed by Voutsadakis for an efficient
extraction of such n-adic concepts. Recently, Cerf et al. proposed the
Data-Peeler algorithm [?] in order to extract all closed concepts from
n-ary relations. Data-Peeler enumerates all the n-adic formal concepts
in a depth-first manner using a binary tree enumeration strategy. When
setting n to 4, Data-Peeler is able to extract quadri-concepts.

In the following, we give a formal definition of the 
problem of mining all frequent quadri-concepts as well as 
the main notions used through the paper. 

IV. The Problem of Mining all Frequent
Quadri-Concepts

In this section, we formalize the problem of mining all frequent
quadri-concepts. We start with an adaptation of the notion of
folksonomy [2] to the quadratic context.

Definition 1: (D-FOLKSONOMY) A d-folksonomy is a tuple
$\mathbb{F}_d = (\mathcal{U}, \mathcal{T}, \mathcal{R}, \mathcal{D}, Y)$
where $\mathcal{U}$, $\mathcal{T}$, $\mathcal{R}$ and $\mathcal{D}$ are
finite sets whose elements are called users, tags, resources and
dates. $Y \subseteq \mathcal{U} \times \mathcal{T} \times \mathcal{R} \times \mathcal{D}$
represents a quaternary relation where each $y \in Y$ can be
represented by a quadruple $y = (u, t, r, d)$ with $u \in \mathcal{U}$,
$t \in \mathcal{T}$, $r \in \mathcal{R}$ and $d \in \mathcal{D}$, which
means that the user u has annotated the resource r using the tag t at
the date d.
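
As an illustration of Definition 1, the quaternary relation Y can be held as a plain set of quadruples. The sketch below is only one possible encoding; the toy relation `Y` and the helper name `dimensions` are hypothetical and do not reproduce Table I. Note that the helper recovers only the elements that actually occur in Y, whereas the definition allows the four sets to contain unused elements.

```python
# Hypothetical toy relation Y (NOT the paper's Table I): each element is a
# quadruple (user, tag, resource, date).
Y = {
    ("u1", "t1", "r1", "d1"),
    ("u1", "t2", "r1", "d1"),
    ("u2", "t1", "r1", "d1"),
    ("u2", "t2", "r1", "d1"),
    ("u1", "t1", "r1", "d2"),
}


def dimensions(relation):
    """Recover the users, tags, resources and dates occurring in Y."""
    users = {u for (u, _, _, _) in relation}
    tags = {t for (_, t, _, _) in relation}
    resources = {r for (_, _, r, _) in relation}
    dates = {d for (_, _, _, d) in relation}
    return users, tags, resources, dates


users, tags, resources, dates = dimensions(Y)
```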

Example 1: Table I depicts an example of a d-folksonomy
$\mathbb{F}_d$ with $\mathcal{U} = \{u_1, u_2, u_3, u_4\}$,
$\mathcal{T} = \{t_1, t_2, t_3\}$, $\mathcal{R} = \{r_1, r_2\}$ and
$\mathcal{D} = \{d_1, d_2\}$. Each cross within the quaternary relation
indicates a tagging operation by a user from $\mathcal{U}$, a tag from
$\mathcal{T}$ and a resource from $\mathcal{R}$ at a date from
$\mathcal{D}$, i.e., a user has tagged a particular resource with a
particular tag at a date d. For example, the user $u_1$ has tagged the
resource $r_1$ with the tags $t_1$, $t_2$ and $t_3$ at the date $d_1$.

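The following definition introduces a (frequent) quadri-set.

Definition 2: (A (FREQUENT) QUADRI-SET) Let
$\mathbb{F}_d = (\mathcal{U}, \mathcal{T}, \mathcal{R}, \mathcal{D}, Y)$
be a d-folksonomy. A quadri-set of $\mathbb{F}_d$ is a quadruple
(A, B, C, E) with $A \subseteq \mathcal{U}$, $B \subseteq \mathcal{T}$,
$C \subseteq \mathcal{R}$ and $E \subseteq \mathcal{D}$ such that
$A \times B \times C \times E \subseteq Y$.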
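
The condition of Definition 2 can be checked directly by enumerating the Cartesian product. The following is a minimal sketch under assumed names (`is_quadri_set`, toy relation `Y`); it is not taken from the paper. With an empty component the product is empty, so the test holds vacuously.

```python
from itertools import product

# Hypothetical toy relation (NOT Table I).
Y = {
    ("u1", "t1", "r1", "d1"),
    ("u1", "t2", "r1", "d1"),
    ("u2", "t1", "r1", "d1"),
    ("u2", "t2", "r1", "d1"),
    ("u1", "t1", "r1", "d2"),
}


def is_quadri_set(A, B, C, E, relation):
    """True when every quadruple of A x B x C x E belongs to the relation."""
    return all(q in relation for q in product(A, B, C, E))


ok = is_quadri_set({"u1", "u2"}, {"t1", "t2"}, {"r1"}, {"d1"}, Y)
# u2 never used t1 on r1 at d2, so this quadruple of sets is not a quadri-set.
not_ok = is_quadri_set({"u1", "u2"}, {"t1"}, {"r1"}, {"d1", "d2"}, Y)
```
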

D-folksonomies have four dimensions which are completely symmetric.
Thus, we can define minimum support thresholds on each dimension.
Hence, the problem of mining frequent quadri-sets is the following:

Problem 1: (Mining all frequent quadri-sets) Let
$\mathbb{F}_d = (\mathcal{U}, \mathcal{T}, \mathcal{R}, \mathcal{D}, Y)$
be a d-folksonomy and let $minsupp_u$, $minsupp_t$, $minsupp_r$ and
$minsupp_d$ be (absolute) user-defined minimum thresholds. The task of
mining all frequent quadri-sets consists in determining all
quadri-sets (A, B, C, E) of $\mathbb{F}_d$ with $|A| \geq minsupp_u$,
$|B| \geq minsupp_t$, $|C| \geq minsupp_r$ and $|E| \geq minsupp_d$.
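
The frequency test of Problem 1 reduces to four cardinality comparisons. The sketch below assumes inclusive thresholds (the illustrative example of Section V keeps concepts with a single resource when $minsupp_r = 1$); the function name `is_frequent` is hypothetical.

```python
def is_frequent(quadri_set, minsupp_u, minsupp_t, minsupp_r, minsupp_d):
    """Check the four absolute minimum supports (assumed inclusive)."""
    A, B, C, E = quadri_set
    return (
        len(A) >= minsupp_u
        and len(B) >= minsupp_t
        and len(C) >= minsupp_r
        and len(E) >= minsupp_d
    )


candidate = ({"u1", "u2"}, {"t1", "t2"}, {"r1"}, {"d1"})
kept = is_frequent(candidate, 2, 2, 1, 1)
dropped = is_frequent(candidate, 2, 2, 1, 2)  # only one date, two required
```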

Our thresholds are antimonotonic constraints: if
$(A_1, B_1, C_1, E_1)$ with $A_1$ being maximal for
$A_1 \times B_1 \times C_1 \times E_1 \subseteq Y$ is not
u-frequent^5, then all $(A_2, B_2, C_2, E_2)$ with
$B_1 \subseteq B_2$, $C_1 \subseteq C_2$ and $E_1 \subseteq E_2$ are
not u-frequent either. The same holds symmetrically for the other
three dimensions. In [8], the authors demonstrate that above the
two-dimensional case, the direct symmetry between monotonicity and
antimonotonicity breaks. Thus, they introduced a lemma which results
from the triadic Galois connection [20] induced by a triadic context.
In the following, we adapt that lemma to our quadratic case.

Lemma 1: (See also [19], Proposition 2) Let $(A_1, B_1, C_1, E_1)$ and
$(A_2, B_2, C_2, E_2)$ be quadri-sets with $A_i$ being maximal for
$A_i \times B_i \times C_i \times E_i \subseteq Y$, for i = 1, 2. If
$B_1 \subseteq B_2$, $C_1 \subseteq C_2$ and $E_1 \subseteq E_2$ then
$A_2 \subseteq A_1$. The same holds symmetrically for the other three
dimensions. In the sequel, the inclusion
$(A_1, B_1, C_1, E_1) \subseteq (A_2, B_2, C_2, E_2)$ holds if and only
if $B_1 \subseteq B_2$, $C_1 \subseteq C_2$, $E_1 \subseteq E_2$ and
$A_2 \subseteq A_1$.
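
The antimonotonic behaviour stated by Lemma 1 can be observed on a small relation: enlarging the tag, resource or date sets can only shrink the maximal user set. This is a sketch on hypothetical data; `maximal_users` is an illustrative helper, not a procedure of the paper.

```python
# Hypothetical toy relation (NOT Table I).
Y = {
    ("u1", "t1", "r1", "d1"),
    ("u1", "t2", "r1", "d1"),
    ("u2", "t1", "r1", "d1"),
    ("u2", "t2", "r1", "d1"),
    ("u1", "t1", "r1", "d2"),
}


def maximal_users(B, C, E, relation):
    """Largest user set A such that A x B x C x E is included in the relation."""
    users = {u for (u, _, _, _) in relation}
    return {
        u
        for u in users
        if all((u, t, r, d) in relation for t in B for r in C for d in E)
    }


A1 = maximal_users({"t1"}, {"r1"}, {"d1"}, Y)        # smaller date set
A2 = maximal_users({"t1"}, {"r1"}, {"d1", "d2"}, Y)  # larger date set
```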

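Example 2: Let $\mathbb{F}_d$ be the d-folksonomy of Table I and let
$S_1 = (\{u_3, u_4\}, t_3, \{r_1, r_2\}, \{d_1, d_2\})$ and
$S_2 = (\{u_1, u_3, u_4\}, \{t_2, t_3\}, \{r_1, r_2\}, d_1)$ be two
quadri-sets of $\mathbb{F}_d$. Then, we have $S_1 \subseteq S_2$ since
$\{u_3, u_4\} \subseteq \{u_1, u_3, u_4\}$,
$t_3 \subseteq \{t_2, t_3\}$,
$\{r_1, r_2\} \subseteq \{r_1, r_2\}$ and
$d_1 \subseteq \{d_1, d_2\}$.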

As the set of all frequent quadri-sets is highly redundant, 
we consider a specific condensed representation, i.e., a 
subset which contains the same information : the set of all 
frequent quadri-concepts. The latter's definition is given as 
follows : 



Definition 3: ((FREQUENT) QUADRATIC CONCEPT) A quadratic concept (or a
quadri-concept for short) of a d-folksonomy
$\mathbb{F}_d = (\mathcal{U}, \mathcal{T}, \mathcal{R}, \mathcal{D}, Y)$
is a quadruple (U, T, R, D) with $U \subseteq \mathcal{U}$,
$T \subseteq \mathcal{T}$, $R \subseteq \mathcal{R}$ and
$D \subseteq \mathcal{D}$ with
$U \times T \times R \times D \subseteq Y$ such that the quadruple
(U, T, R, D) is maximal, i.e., none of these sets can be extended
without shrinking one of the other three dimensions. A quadri-concept
is said to be frequent whenever it is a frequent quadri-set.
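
The maximality condition of Definition 3 can be tested naively: a quadri-set is a quadri-concept when no single element can be added to any one dimension while keeping the other three unchanged. The sketch below is a brute-force check on hypothetical data, not the QuadriCons procedure; all names are illustrative.

```python
from itertools import product

# Hypothetical toy relation (NOT Table I).
Y = {
    ("u1", "t1", "r1", "d1"),
    ("u1", "t2", "r1", "d1"),
    ("u2", "t1", "r1", "d1"),
    ("u2", "t2", "r1", "d1"),
    ("u1", "t1", "r1", "d2"),
}


def is_quadri_set(parts, relation):
    """True when the Cartesian product of the four parts lies in the relation."""
    return all(q in relation for q in product(*parts))


def is_quadri_concept(parts, relation):
    """Quadri-set that cannot grow by one element on any single dimension."""
    if not is_quadri_set(parts, relation):
        return False
    for axis in range(4):
        universe = {q[axis] for q in relation}
        for extra in universe - set(parts[axis]):
            grown = list(parts)
            grown[axis] = set(parts[axis]) | {extra}
            if is_quadri_set(grown, relation):
                return False  # extendable, hence not maximal
    return True


closed = is_quadri_concept(({"u1", "u2"}, {"t1", "t2"}, {"r1"}, {"d1"}), Y)
open_ = is_quadri_concept(({"u1", "u2"}, {"t1"}, {"r1"}, {"d1"}), Y)
```

Testing single-element extensions is sufficient here: if a dimension could be extended by several elements at once, it could also be extended by any one of them.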

Problem 2: (Mining all frequent quadri-concepts) Let
$\mathbb{F}_d = (\mathcal{U}, \mathcal{T}, \mathcal{R}, \mathcal{D}, Y)$
be a d-folksonomy and let $minsupp_u$, $minsupp_t$, $minsupp_r$ and
$minsupp_d$ be user-defined minimum thresholds. The task of mining all
frequent quadri-concepts consists in determining all quadri-concepts
(U, T, R, D) of $\mathbb{F}_d$ with $|U| \geq minsupp_u$,
$|T| \geq minsupp_t$, $|R| \geq minsupp_r$ and $|D| \geq minsupp_d$.
The set of all frequent quadri-concepts of $\mathbb{F}_d$ is equal to
$\mathcal{QC} = \{qc \mid qc = (U, T, R, D)$ is a frequent
quadri-concept$\}$.

Remark 1: It is important to note that the extracted representation of
quadri-concepts is information lossless. Hence, after solving Problem
2, we can easily solve Problem 1 by enumerating all quadri-sets
(A, B, C, E) such that there exists a frequent quadri-concept
(U, T, R, D) with $A \subseteq U$, $B \subseteq T$, $C \subseteq R$,
$E \subseteq D$ and $|A| \geq minsupp_u$, $|B| \geq minsupp_t$,
$|C| \geq minsupp_r$ and $|E| \geq minsupp_d$.
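
The enumeration described in Remark 1 can be sketched as follows: for one frequent quadri-concept, every combination of sufficiently large subsets of its four components is a frequent quadri-set. The function names and the sample concept are hypothetical, and thresholds are assumed inclusive.

```python
from itertools import combinations, product


def subsets_at_least(items, k):
    """Yield every subset of items holding at least k elements."""
    ordered = sorted(items)
    for size in range(k, len(ordered) + 1):
        for combo in combinations(ordered, size):
            yield frozenset(combo)


def frequent_quadri_sets(concept, thresholds):
    """Frequent quadri-sets covered by one quadri-concept (U, T, R, D)."""
    choices = [
        list(subsets_at_least(part, k)) for part, k in zip(concept, thresholds)
    ]
    return set(product(*choices))


concept = ({"u1", "u2"}, {"t1", "t2"}, {"r1"}, {"d1"})
covered = frequent_quadri_sets(concept, (2, 1, 1, 1))
```

Solving Problem 1 then amounts to taking the union of these sets over all frequent quadri-concepts.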

In the following, we introduce the QuadriCons algorithm for mining all
frequent quadri-concepts, before discussing its performance versus the
Data-Peeler algorithm for the quadratic case in the subsequent section.

V. The QuadriCons Algorithm for Mining all 
Frequent Quadri-Concepts 

In this section, we introduce new notions that are used throughout the
QuadriCons algorithm. Hence, we introduce a new closure operator for a
d-folksonomy, which splits the search space into equivalence classes,
as well as an extension of the notion of minimal generator [5]. Then,
we provide an illustrative example of our algorithm.

A. Main notions of QuadriCons

Before introducing our closure operator for a d-folksonomy/quadratic
context, we give a general definition of a closure operator for an
n-adic context.

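Definition 4: (CLOSURE OPERATOR OF AN n-ADIC CONTEXT) Let
$S = (S_1, S_2, \ldots, S_n)$ be an n-set, with $S_1$ being maximal for
$S_1 \times \ldots \times S_n \subseteq Y$, of an n-adic context
$\mathbb{K}^n$ with n dimensions, i.e.,
$\mathbb{K}^n = (\mathcal{D}_1, \mathcal{D}_2, \ldots, \mathcal{D}_n, Y)$.
A mapping h is defined as follows:

$h(S) = h(S_1, S_2, \ldots, S_n) = (C_1, C_2, \ldots, C_n)$ such that:

$C_1 = S_1$

$\wedge\; C_2 = \{c_2 \in \mathcal{D}_2 \mid (c_1, c_2, c_3, \ldots, c_n) \in Y \;\; \forall c_1 \in C_1, \forall c_3 \in S_3, \ldots, \forall c_n \in S_n\}$

$\vdots$

$\wedge\; C_n = \{c_n \in \mathcal{D}_n \mid (c_1, c_2, \ldots, c_{n-1}, c_n) \in Y \;\; \forall c_1 \in C_1, \ldots, \forall c_{n-1} \in C_{n-1}\}$

^5 With regard to the dimension $\mathcal{U}$.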

Proposition 1: h is a closure operator. 

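Proof: To prove that h is a closure operator, we have to prove that it
fulfils the three properties of extensivity, idempotency and isotony
[21].

(1) Extensivity: Let $S = (S_1, S_2, \ldots, S_n)$ be an n-set of
$\mathbb{K}^n$ $\Rightarrow$ $h(S) = (C_1, C_2, \ldots, C_n)$ such
that: $C_1 = S_1$,
$C_2 = \{c_2 \in \mathcal{D}_2 \mid (c_1, c_2, c_3, \ldots, c_n) \in Y \; \forall c_1 \in C_1, \forall c_3 \in S_3, \ldots, \forall c_n \in S_n\} \supseteq S_2$
since $C_1 = S_1$, ...,
$C_n = \{c_n \in \mathcal{D}_n \mid (c_1, c_2, \ldots, c_{n-1}, c_n) \in Y \; \forall c_1 \in C_1, \forall c_2 \in C_2, \ldots, \forall c_{n-1} \in C_{n-1}\} \supseteq S_n$
since $C_1 = S_1$, $C_2 \supseteq S_2$, ...,
$C_{n-1} \supseteq S_{n-1}$. Then, $C_1 = S_1$ and
$S_i \subseteq C_i$ for $i = 2, \ldots, n$ $\Rightarrow$
$S \subseteq h(S)$ (cf. Lemma 1).

(2) Idempotency: Let $S = (S_1, S_2, \ldots, S_n)$ be an n-set of
$\mathbb{K}^n$ $\Rightarrow$ $h(S) = (C_1, C_2, \ldots, C_n)$
$\Rightarrow$ $h(C_1, C_2, \ldots, C_n) = (C'_1, C'_2, \ldots, C'_n)$
such that: $C'_1 = C_1$,
$C'_2 = \{c'_2 \in \mathcal{D}_2 \mid (c_1, c'_2, c_3, \ldots, c_n) \in Y \; \forall c_1 \in C'_1, \forall c_3 \in S_3, \ldots, \forall c_n \in S_n\} = C_2$
since $C'_1 = C_1$, ...,
$C'_n = \{c'_n \in \mathcal{D}_n \mid (c_1, c_2, \ldots, c_{n-1}, c'_n) \in Y \; \forall c_1 \in C'_1, \forall c_2 \in C'_2, \ldots, \forall c_{n-1} \in C'_{n-1}\} = C_n$
since we have $C'_1 = C_1$, $C'_2 = C_2$, ...,
$C'_{n-1} = C_{n-1}$. Then, $C'_i = C_i$ for $i = 1, \ldots, n$
$\Rightarrow$ $h(h(S)) = h(S)$.

(3) Isotony: Let $S = (S_1, S_2, \ldots, S_n)$ and
$S' = (S'_1, S'_2, \ldots, S'_n)$ be two n-sets of $\mathbb{K}^n$ with
$S \subseteq S'$, i.e., $S'_1 \subseteq S_1$ and
$S_i \subseteq S'_i$ for $i = 2, \ldots, n$ (cf. Lemma 1). We have
$h(S) = (C_1, C_2, \ldots, C_n)$ and
$h(S') = (C'_1, C'_2, \ldots, C'_n)$ such that:

. $C_1 = S_1$, $C'_1 = S'_1$ and $S'_1 \subseteq S_1$ $\Rightarrow$
  $C'_1 \subseteq C_1$
. $C_2 = \{c_2 \in \mathcal{D}_2 \mid (c_1, c_2, c_3, \ldots, c_n) \in Y \; \forall c_1 \in C_1, \forall c_3 \in S_3, \ldots, \forall c_n \in S_n\}$
  and
  $C'_2 = \{c'_2 \in \mathcal{D}_2 \mid (c_1, c'_2, c_3, \ldots, c_n) \in Y \; \forall c_1 \in C'_1, \forall c_3 \in S'_3, \ldots, \forall c_n \in S'_n\}$
  $\Rightarrow$ $C_2 \subseteq C'_2$ since $S_i \subseteq S'_i$ for
  $i = 3, \ldots, n$ and $C'_1 \subseteq C_1$ (cf. Lemma 1)
. $C_n = \{c_n \in \mathcal{D}_n \mid (c_1, c_2, \ldots, c_{n-1}, c_n) \in Y \; \forall c_1 \in C_1, \forall c_2 \in C_2, \ldots, \forall c_{n-1} \in C_{n-1}\}$
  and
  $C'_n = \{c'_n \in \mathcal{D}_n \mid (c_1, c_2, \ldots, c_{n-1}, c'_n) \in Y \; \forall c_1 \in C'_1, \forall c_2 \in C'_2, \ldots, \forall c_{n-1} \in C'_{n-1}\}$
  $\Rightarrow$ $C_n \subseteq C'_n$ since $C'_1 \subseteq C_1$,
  $C_2 \subseteq C'_2$, ..., $C_{n-1} \subseteq C'_{n-1}$ (cf. Lemma 1)

Then, $C'_1 \subseteq C_1$ and $C_i \subseteq C'_i$ for
$i = 2, \ldots, n$ $\Rightarrow$ $h(S) \subseteq h(S')$.

According to (1), (2) and (3), h is a closure operator. ■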

For n = 4, we instantiate the closure operator of a quadratic context,
i.e., a d-folksonomy, as follows:

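Definition 5: (CLOSURE OPERATOR OF A d-folksonomy) Let
S = (A, B, C, E) be a quadri-set of $\mathbb{F}_d$ with A being maximal
for $A \times B \times C \times E \subseteq Y$. The closure operator h
of a d-folksonomy is defined as follows:

$h(S) = h(A, B, C, E) = (U, T, R, D)$ such that:

$U = A$

$\wedge\; T = \{t_j \in \mathcal{T} \mid (u_i, t_j, r_k, d_l) \in Y \;\; \forall u_i \in U, \forall r_k \in C, \forall d_l \in E\}$

$\wedge\; R = \{r_k \in \mathcal{R} \mid (u_i, t_j, r_k, d_l) \in Y \;\; \forall u_i \in U, \forall t_j \in T, \forall d_l \in E\}$

$\wedge\; D = \{d_l \in \mathcal{D} \mid (u_i, t_j, r_k, d_l) \in Y \;\; \forall u_i \in U, \forall t_j \in T, \forall r_k \in R\}$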
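
The four steps of Definition 5 translate almost literally into set comprehensions. The sketch below is a direct, unoptimized reading of the definition on a hypothetical toy relation; the function name `closure` is illustrative. As in the definition, the input tag set B is not consulted: T is recomputed from U, C and E.

```python
# Hypothetical toy relation (NOT Table I).
Y = {
    ("u1", "t1", "r1", "d1"),
    ("u1", "t2", "r1", "d1"),
    ("u2", "t1", "r1", "d1"),
    ("u2", "t2", "r1", "d1"),
    ("u1", "t1", "r1", "d2"),
}


def closure(A, B, C, E, relation):
    """Compute h(A, B, C, E) = (U, T, R, D) following Definition 5."""
    tags = {t for (_, t, _, _) in relation}
    resources = {r for (_, _, r, _) in relation}
    dates = {d for (_, _, _, d) in relation}
    U = set(A)
    # Tags shared by all users of U on every resource of C at every date of E.
    T = {t for t in tags
         if all((u, t, r, d) in relation for u in U for r in C for d in E)}
    # Resources carrying all tags of T for all users of U at every date of E.
    R = {r for r in resources
         if all((u, t, r, d) in relation for u in U for t in T for d in E)}
    # Dates at which all users of U put all tags of T on all resources of R.
    D = {d for d in dates
         if all((u, t, r, d) in relation for u in U for t in T for r in R)}
    return U, T, R, D


first = closure({"u1", "u2"}, {"t1"}, {"r1"}, {"d1"}, Y)
second = closure({"u1"}, {"t1"}, {"r1"}, {"d2"}, Y)
```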

Remark 2: Roughly speaking, h(S) computes the largest quadri-set in
the d-folksonomy $\mathbb{F}_d$ which contains maximal sets of tags,
resources and dates shared by a group of users. The application of the
closure operator h on a quadri-set gives rise to a quadri-concept
qc = (U, T, R, D). In the remainder of the paper, the U, R, T and D
parts are respectively called Extent, Intent, Modus and Variable.

As in the dyadic and triadic cases, the closure operator splits the
search space into equivalence classes, which we introduce in the
following:

Definition 6: (EQUIVALENCE CLASS) Let $S_1 = (A_1, B_1, C_1, E_1)$ and
$S_2 = (A_2, B_2, C_2, E_2)$ be two quadri-sets of $\mathbb{F}_d$ and
qc be a frequent quadri-concept. $S_1$ and $S_2$ belong to the same
equivalence class represented by the quadri-concept qc, i.e.,
$S_1 \equiv_{qc} S_2$, iff $h(S_1) = h(S_2) = qc$.



[Figure 1: diagram of an equivalence class, showing the quadri-concept, the quadri-sets and the quadri-generators.]

Figure 1. Example of an equivalence class extracted from the
d-folksonomy depicted by Table I

Minimal Generators (MGs) have been shown to play an important role in
many theoretical and practical problem settings involving closure
systems. Such minimal generators can offer a complementary and simpler
way to understand the concept, because they may contain far fewer
attributes than closed concepts. Indeed, MGs represent the smallest
elements within an equivalence class. Complementary to closures,
minimal generators provide a way to characterize formal concepts [9].
In the following, we introduce an extension of the definition of an MG
to the d-folksonomy.

Definition 7: (QUADRI-MINIMAL GENERATOR) Let g = (A, B, C, E) be a
quadri-set of $\mathbb{F}_d$ such that $A \subseteq \mathcal{U}$,
$B \subseteq \mathcal{T}$, $C \subseteq \mathcal{R}$ and
$E \subseteq \mathcal{D}$, and let $qc \in \mathcal{QC}$. The quadruple
g is a quadri-minimal generator (quadri-generator for short) of qc iff
h(g) = qc and $\nexists\; g_1 = (A_1, B_1, C_1, E_1)$ such that:

1) $A = A_1$,



2) $(B_1 \subseteq B \wedge C_1 \subset C \wedge E_1 \subset E) \vee (B_1 \subset B \wedge C_1 \subseteq C \wedge E_1 \subseteq E)$, and

3) $h(g) = h(g_1) = qc$.

Example 3: Let us consider the d-folksonomy shown in Table I. Figure 1
shows an example of an equivalence class. For example, we have
$h(g_1 = (\{u_1, u_2, u_3\}, t_3, r_1, d_1)) = (\{u_1, u_2, u_3\}, \{t_2, t_3, t_4\}, r_1, \{d_1, d_2\}) = qc$
such that $g_1$ is a quadri-generator. Thus, qc is the quadri-concept
of this equivalence class, i.e., its largest unsubsumed quadri-set, and
it has two quadri-generators. However,
$g_3 = (\{u_1, u_2, u_3\}, \{t_3, t_4\}, r_1, d_1)$ is not a
quadri-generator of qc since there exists $g_1$ such that
$g_1.extent = g_3.extent$, $g_1.intent = g_3.intent$ $\wedge$
$g_1.modus \subset g_3.modus$ $\wedge$ $g_1.variable = g_3.variable$.

Based on these newly introduced notions, we propose in the following
our new QuadriCons algorithm for a scalable mining of frequent
quadri-concepts from a d-folksonomy.

B. The QuadriCons Algorithm

In the following, we introduce a test-and-generate algorithm, called
QuadriCons, for mining frequent quadri-concepts from a d-folksonomy.
Since quadri-generators are minimal keys of an equivalence class,
their detection is largely eased. QuadriCons operates in four steps as
follows: the FindMinimalGenerators procedure is invoked as a first
step for the extraction of quadri-generators. Then, the ClosureCompute
procedure is invoked for the three next steps in order to compute
respectively the modus, intent and variable parts of quadri-concepts.
The pseudo-code of the QuadriCons algorithm is sketched by Algorithm 1.
QuadriCons takes as input a d-folksonomy
$\mathbb{F}_d = (\mathcal{U}, \mathcal{T}, \mathcal{R}, \mathcal{D}, Y)$
as well as four user-defined thresholds (one for each dimension):
$minsupp_u$, $minsupp_t$, $minsupp_r$ and $minsupp_d$. The output of
the QuadriCons algorithm is the set of all frequent quadri-concepts
that fulfil these thresholds. QuadriCons works as follows: it starts
by invoking the FindMinimalGenerators procedure (Step 1), whose
pseudo-code is given by Algorithm 2, in order to extract the
quadri-generators stored in the set $\mathcal{MG}$ (Line 3). For such
extraction, FindMinimalGenerators computes for each triple (t, r, d)
the set $U_s$ representing the maximal set of users sharing both tag t
and resource r at the date d (Algorithm 2, Line 3). If $|U_s|$ is
frequent w.r.t. $minsupp_u$ (Line 4), a quadri-generator is then
created (if it does not already exist) with the appropriate fields
(Line 5). Algorithm 2 invokes the AddQuadri function, which adds the
quadri-generator g to the set $\mathcal{MG}$ (Line 7).

Hereafter, QuadriCons invokes the ClosureCompute procedure (Step 2)
for each quadri-generator of $\mathcal{MG}$ (Lines 5-7), whose
pseudo-code is given by Algorithm 3: the aim is to compute the modus
part of each quadri-concept. At this step, the two first cases of
Algorithm 3 (Lines 3 and 6) have to be considered w.r.t. the extent of
each quadri-generator. The ClosureCompute procedure returns the set
$\mathcal{QS}$ formed by quadri-sets. The indicator flag (equal


ALGORITHM 1 : QuadriCons
Data :
  1) F_d = (U, T, R, D, Y) : A d-folksonomy.
  2) minsupp_u, minsupp_t, minsupp_r, minsupp_d : User-defined thresholds.
Results : QC : {Frequent quadri-concepts}.

1  Begin
2    /* Step 1 : The extraction of quadri-generators */
3    FindMinimalGenerators(F_d, MG, minsupp_u);
4    /* Step 2 : The computation of the modus part */
5    Foreach quadri-gen g ∈ MG do
6      ClosureCompute(MG, minsupp_u, minsupp_t, minsupp_r, g, QS, 1);
7    End
8    PruneInfrequentSets(QS, minsupp_t);
9    /* Step 3 : The computation of the intent part */
10   Foreach quadri-set s ∈ QS do
11     ClosureCompute(QS, minsupp_u, minsupp_t, minsupp_r, s, QS, 2);
12   End
13   PruneInfrequentSets(QS, minsupp_r);
14   /* Step 4 : The computation of the variable part */
15   Foreach quadri-set s ∈ QS do
16     ClosureCompute(QS, minsupp_u, minsupp_t, minsupp_r, s, QC, 3);
17   End
18   PruneInfrequentSets(QC, minsupp_d);
19 End
20 return QC ;



ALGORITHM 2 : FindMinimalGenerators
Data :
  1) MG : The set of frequent quadri-generators.
  2) F_d = (U, T, R, D, Y) : A d-folksonomy.
  3) minsupp_u : User-defined threshold of user's support.
Results : MG : {The set of frequent quadri-generators}.

1  Begin
2    Foreach triple (t, r, d) of F_d do
3      U_s = {u_i ∈ U | (u_i, t, r, d) ∈ Y} ;
4      If |U_s| ≥ minsupp_u then
5        g.extent = U_s; g.intent = r; g.modus = t; g.variable = d
6        If g ∉ MG then
7          AddQuadri(MG, g)
8        End
9      End
10   End
11 End
12 return MG ;
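
Algorithm 2 can be sketched in Python by grouping users per (tag, resource, date) triple. This is an illustrative reading of the pseudo-code, not the authors' implementation: the toy relation and the tuple encoding (extent, modus, intent, variable) are assumptions, and the threshold is taken as inclusive.

```python
from collections import defaultdict

# Hypothetical toy relation (NOT Table I).
Y = {
    ("u1", "t1", "r1", "d1"),
    ("u1", "t2", "r1", "d1"),
    ("u2", "t1", "r1", "d1"),
    ("u2", "t2", "r1", "d1"),
    ("u1", "t1", "r1", "d2"),
}


def find_minimal_generators(relation, minsupp_u):
    """Return quadri-generators as (extent, modus tag, intent resource, date)."""
    users_by_triple = defaultdict(set)
    for (u, t, r, d) in relation:
        users_by_triple[(t, r, d)].add(u)  # maximal user set of each triple
    generators = set()  # a set makes the "g not in MG" test implicit
    for (t, r, d), users in users_by_triple.items():
        if len(users) >= minsupp_u:
            generators.add((frozenset(users), t, r, d))
    return generators


frequent = find_minimal_generators(Y, minsupp_u=2)
everything = find_minimal_generators(Y, minsupp_u=1)
```

Only triples that occur in Y are visited, so the number of candidates never exceeds the number of distinct (tag, resource, date) combinations.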



ALGORITHM 3 : ClosureCompute
Data :
  1) S_IN : The input set.
  2) min_u, min_t, min_r : User-defined thresholds.
  3) q : A quadri-generator/quadri-set.
  4) S_OUT : The output set.
  5) i : an indicator.
Results : S_OUT : The output set.

1  Begin
2    Foreach quadri-set q' ∈ S_IN do
3      If i=1 and q.intent = q'.intent and q.extent ⊆ q'.extent then
4        s.intent = q.intent; s.extent = q.extent; s.variable = q.variable;
         s.modus = q.modus ∪ q'.modus; AddQuadri(S_OUT, s);
5      End
6      Else if i=1 and q.intent = q'.intent and q and q' incomparable then
7        g.extent = q.extent ∩ q'.extent; g.modus = q.modus ∪ q'.modus;
         g.intent = q.intent; g.variable = q.variable;
8        If g u-frequent then AddQuadri(MG, g);
9      End
10     Else if i=2 and q.extent ⊆ q'.extent and q.modus ⊆ q'.modus and
       q.intent ≠ q'.intent then
11       qs.extent = q.extent; qs.modus = q.modus; qs.variable = q.variable;
         qs.intent = q.intent ∪ q'.intent;
12       AddQuadri(S_OUT, qs);
13     End
14     Else if i=2 and q and q' incomparable then
15       s.extent = q.extent ∩ q'.extent; s.modus = q.modus ∩ q'.modus;
         s.variable = q.variable; s.intent = q.intent ∪ q'.intent;
16       If s is u-frequent and t-frequent then AddQuadri(S_OUT, s);
17     End
18     Else if i=3 and q.extent ⊆ q'.extent and q.modus ⊆ q'.modus and
       q.intent ⊆ q'.intent and q.variable ≠ q'.variable then
19       qc.extent = q.extent; qc.modus = q.modus; qc.intent = q.intent;
         qc.variable = q.variable ∪ q'.variable;
20       AddQuadri(S_OUT, qc);
21     End
22     Else if i=3 and q and q' incomparable then
23       s.extent = q.extent ∩ q'.extent; s.modus = q.modus ∩ q'.modus;
         s.intent = q.intent ∩ q'.intent;
         s.variable = q.variable ∪ q'.variable;
24       If s is u-frequent, t-frequent and r-frequent then
         AddQuadri(S_OUT, s);
25     End
26   End
27 End
28 return S_OUT ;



to 1 here) set by QuadriCons shows whether the quadri-set considered
by the ClosureCompute procedure is a quadri-generator. In the third
step, QuadriCons invokes the ClosureCompute procedure a second time
for each quadri-set of $\mathcal{QS}$ (Lines 9-11), in order to compute
the intent part. ClosureCompute focuses on quadri-sets of
$\mathcal{QS}$ having different intent parts (Algorithm 3, Line 10).
The fourth and final step of QuadriCons invokes the ClosureCompute
procedure a last time with an indicator equal to 3. This makes it
possible to focus on quadri-sets having different variable parts
(Algorithm 3, Line 18) before generating quadri-concepts. QuadriCons
comes to an end after this step and returns the set of the frequent
quadri-concepts which fulfil the four thresholds $minsupp_u$,
$minsupp_t$, $minsupp_r$ and $minsupp_d$. The QuadriCons algorithm
invokes the PruneInfrequentSets function (Lines 8, 13 and 18) in order
to prune infrequent quadri-sets/concepts, i.e., those whose
modus/intent/variable cardinality does not fulfil the aforementioned
thresholds.

C. Structural properties of QuadriCons

Proposition 2: The QuadriCons algorithm is correct and complete. It
retrieves accurately all the frequent quadri-concepts.

Proof: The FindMinimalGenerators procedure extracts all
quadri-generators from the d-folksonomy $\mathbb{F}_d$ since all the
context's triples are enumerated in order to group maximal sets of
users w.r.t. each triple (t, r, d) (Algorithm 2, Lines 2-10). This
allows all the quadri-generators to be extracted accurately. From the
quadri-generators already extracted, QuadriCons calls the
ClosureCompute procedure three times in order to compute,
respectively, the modus, intent and variable parts of each
quadri-generator. At each call, i.e., i = 1, 2, 3, for each couple of
candidates q and q', two cases have to be considered:

1) (Algorithm 3, lines 3, 10, 18) q and q' are comparable. Hence, a
   quadri-set (quadri-concept when i = 3) is created from the union of
   the different parts of both candidates.

2) (Algorithm 3, lines 6, 14, 22) q and q' are incomparable. Hence, a
   new quadri-set (quadri-generator when i = 1) is created matching the
   different parts of q and q'.

Thus, all cases of comparison between candidates are enumerated.
Finally, the PruneInfrequentSets procedure prunes infrequent
quadri-concepts w.r.t. the minimum thresholds (Algorithm 1, lines 8, 13
and 18). We conclude that QuadriCons faithfully extracts all frequent
quadri-concepts. So, it is correct. ■

Proposition 3: The QuadriCons algorithm terminates.
Proof: The number of quadri-generators generated by QuadriCons is
finite. Indeed, the number of QG candidates generated from a context
$(\mathcal{U}, \mathcal{T}, \mathcal{R}, \mathcal{D})$ is at most
$|\mathcal{T}| \times |\mathcal{R}| \times |\mathcal{D}|$. Since the
set $\mathcal{MG}$ of quadri-generators is finite, the three loops of
Algorithm 1 running over this set are thus finite. Moreover, the total
number of quadri-concepts generated by QuadriCons is at most
$2^{|\mathcal{T}| + |\mathcal{R}| + |\mathcal{D}|}$. Therefore, the
algorithm QuadriCons terminates. ■
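
As a worked instance of these bounds, assume the dimensions of the d-folksonomy of Example 1, i.e., three tags, two resources and two dates. The first value agrees with the twelve candidate quadri-generators mentioned in the illustrative example of Section V-D.

```latex
\[
|\mathcal{T}| \times |\mathcal{R}| \times |\mathcal{D}| = 3 \times 2 \times 2 = 12,
\qquad
2^{|\mathcal{T}| + |\mathcal{R}| + |\mathcal{D}|} = 2^{3+2+2} = 128 .
\]
```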

Theoretical Complexity issues: As in the triadic case [2], the number
of (frequent) quadri-concepts may grow exponentially in the worst
case. Hence, the theoretical complexity of our algorithm is around
$O(2^n)$ with $n = |\mathcal{T}| + |\mathcal{R}| + |\mathcal{D}|$.
Nevertheless, as will be shown in the section dedicated to
experimental results, from a practical point of view the actual
performance is far from being exponential and QuadriCons exhibits the
desired scalability. Therefore, we focus on empirical evaluations on
large-scale real-world datasets.

D. Illustrative example 

Consider the d-folksonomy depicted by Table I with $minsupp_u = 2$,
$minsupp_t = 2$, $minsupp_r = 1$ and $minsupp_d = 1$. Figure 2 sketches
the execution trace of QuadriCons on this context. As described above,
QuadriCons operates in four steps:

1) (Step I) The first step of QUADRlCONS involves 
the extraction of quadri-generators iQGs) from the 
context (Algorithm 1, Line 3). QGs are maximal sets 
of users following a triple of tag, resource and date. 
Thus, eleven QGs (among twelve) fulfill the minimum 
threshold minsuppu {cf., Figured Step 1). 

2) (Step 2) Next, QUADRlCONS invokes the CLOSURE- 
COMPUTE procedure a first time on the quadri- 
generators allowing the computation of the modus part 
(the set of tags) of such candidates (Algorithm 1, Lines 
5-8). For example, since the extent part (the set of 
users) of {{ui, M2, U4}, ti, ri, di] is included into 
that of U2, 1*3, U4}, t2, Ti, di}, the modus part of 
the first QG will be equal to {ti, 12]- In addition, new 
QGs can be created from intersection of the first ones 
(Algorithm[3] Lines 6-9) : it is the case of the two QGs 
(a) and (b) {cf.. Figured Step 2). Finally, candidates 
that not fulfill the minimum threshold minsuppt are 
pruned (cf., the three last ones). 

3) (Step 3) Then, QUADRICONS computes the intent part (the
set of resources) of each candidate within a second call to
the ClosureCompute procedure (Algorithm 1, Lines 10-13).
For example, the candidate $\{\{u_1, u_2, u_4\}, \{t_1, t_2\}, r_1, d_1\}$
has an extent, modus and variable included in or equal to
those of the candidate $\{\{u_1, u_2, u_4\}, \{t_1, t_2\}, r_2, d_1\}$.
Its intent will therefore be equal to $\{r_1, r_2\}$. At this
step, four candidates fulfill the minimum threshold
$minsupp_r$ (cf. Figure 2, Step 3). By merging comparable
candidates, this step also reduces their number.

4) (Step 4) Via a last call to the ClosureCompute
procedure, QUADRICONS computes the variable part
(the set of dates) of each candidate while pruning
infrequent ones (Algorithm 1, Lines 15-18). For example,
since the candidate $\{\{u_1, u_2\}, \{t_1, t_2\}, r_1, d_2\}$
has an extent, modus and intent included in those of
$\{\{u_1, u_2, u_4\}, \{t_1, t_2\}, \{r_1, r_2\}, d_1\}$, its
variable will be equal to $\{d_1, d_2\}$ (cf. Figure 2, Step 4).

After Step 4, QUADRICONS terminates. The four
frequent quadri-concepts given as output are:

1) $\{\{u_1, u_2, u_4\}, \{t_1, t_2\}, \{r_1, r_2\}, d_1\}$

2) $\{\{u_1, u_3, u_4\}, \{t_2, t_3\}, \{r_1, r_2\}, d_1\}$

3) $\{\{u_1, u_4\}, \{t_1, t_2, t_3\}, \{r_1, r_2\}, d_1\}$

4) $\{\{u_1, u_2\}, \{t_1, t_2\}, r_1, \{d_1, d_2\}\}$

Figure 2. Execution trace of QUADRICONS on the d-folksonomy
depicted by Table I
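
As a quick sanity check on the example, the following sketch (not part of the original algorithm) tests that each of the four reported quadri-concepts reaches the thresholds $minsupp_u = 2$, $minsupp_t = 2$, $minsupp_r = 1$ and $minsupp_d = 1$. The helper name `is_frequent` and the tuple layout are assumptions made for illustration.

```python
# Thresholds of the illustrative example: (users, tags, resources, dates).
MINSUPP = (2, 2, 1, 1)

# The four quadri-concepts given as output, written as
# (extent, modus, intent, variable) = (users, tags, resources, dates).
concepts = [
    ({"u1", "u2", "u4"}, {"t1", "t2"}, {"r1", "r2"}, {"d1"}),
    ({"u1", "u3", "u4"}, {"t2", "t3"}, {"r1", "r2"}, {"d1"}),
    ({"u1", "u4"}, {"t1", "t2", "t3"}, {"r1", "r2"}, {"d1"}),
    ({"u1", "u2"}, {"t1", "t2"}, {"r1"}, {"d1", "d2"}),
]


def is_frequent(concept, minsupp=MINSUPP):
    """Return True when every component reaches its minimum support."""
    return all(len(part) >= m for part, m in zip(concept, minsupp))


all_frequent = all(is_frequent(c) for c in concepts)
print(all_frequent)
```

A candidate with a single user, for instance, would be rejected by the same test because it falls below $minsupp_u = 2$.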



VI. Evaluation and Discussion 

In this section, we report extensive experiments that
assess the performance of QUADRICONS against the
state-of-the-art Data-Peeler algorithm in

Concretely, this means that the users $u_1$ and $u_2$ who shared the resource
$r_1$ with the tags $t_1$ and $t_2$ at the date $d_2$ also shared it at the date $d_1$.



terms of execution time. We also compare the memory
consumed by both algorithms. Finally, we compare the number
of frequent quadri-concepts with the number of frequent
quadri-sets in order to assess the compactness of the
extracted representation. We ran our experiments on two
real-world datasets, described below. Both datasets [22] are
freely downloadable, and statistics about these snapshots are
summarized in Table II.

• MovieLens (http://movielens.org) is a movie recom- 
mendation website. Users are asked to annotate movies 
they like and dislike. Quadruples are sets of users 
sharing movies using tags at different dates. 

• Last.FM (http://last.fm) is a music website, founded in
2002. It claimed 30 million active users in March
2009. Quadruples are sets of users annotating artists
through tags at different dates.





                        Dataset 1                  Dataset 2
                        (MovieLens)                (Last.FM)
Type                    Dense                      Sparse
# Quadruples            95580                      186479
# Users                 4010                       1892
# Tags                  15227                      9749
# Resources             11272 (movies)             12523 (artists)
# Dates (timestamps)    81601                      3549
Periods                 12/01/2005 - 20/12/2008    10/01/2007 - 07/08/2011

Table II

Characteristics of the considered snapshots.



Datasets     Dates       Users       Tags        Resources
MovieLens    03/12/05    krycek      kids        Harry Potter
             16/07/06    maria       fantasy     The Prisoner of Azkaban
             21/02/08                darkness    The Order of the Phoenix
                                     magic
Last.FM      07/05/10    csmdavis    pop         Britney Spears
             02/06/11    franny      concert     Madonna
                         rossanna    dance

Table III

Examples of frequent quadri-concepts of MovieLens and
Last.FM.



A. Examples of quadri-concepts 

Table III shows two examples of frequent quadri-concepts
extracted from the MovieLens and Last.FM datasets. The
first one depicts that the users krycek and maria used the
tags kids, fantasy, darkness and magic to annotate the movie
Harry Potter and its sequels successively in 03/12/2005, in

All implemented algorithms are in C++ (compiled with GCC 4.1.2)
and we used an Intel Core i7 CPU system with 4 GB RAM. Tests were
carried out on the Linux operating system Ubuntu 10.10.1.
http://movielens.org



16/07/2006 and then in 21/02/2008. Such a concept may be
exploited further to recommend tags for that movie or to
analyze the evolution of the tags associated with "Harry Potter".
The second quadri-concept shows that the users csmdavis,
franny and rossanna shared the tags pop, concert and dance
to describe the artists Britney Spears and Madonna in
07/05/10 and then in 02/06/11. We can use such a
quadri-concept to recommend the users franny and rossanna to
the first one, i.e., csmdavis, as they share the same interest
in both artists and use the same tags. It is also useful for
studying the evolution of an artist's fans and of the
vocabulary they use to annotate the artist through time.

In the following, in order to assess the performance of
QUADRICONS vs. Data-Peeler when extracting
quadri-concepts, we ran both algorithms on both datasets
while varying the minimum thresholds, as depicted by Tables
IV and V.

B. Execution Time 

Tables IV and V show the runtimes of the QUADRICONS
algorithm vs. those of Data-Peeler for different numbers of
quadruples, which grow from 20000 to 95580 for the
MovieLens dataset and from 40000 to 186479 for the Last.FM
dataset, and for different values of the minimum thresholds.
We observe that, for both datasets and for all numbers of
quadruples, Data-Peeler is far slower than QUADRICONS.
QUADRICONS ran up to 332 times faster than Data-Peeler on
Last.fm and up to 124 times faster on MovieLens. The poor
performance of Data-Peeler is explained by its strategy: it
starts by storing the entire dataset in a binary tree
structure, which is meant to facilitate its run and then the
extraction of quadri-concepts. However, such a structure is
not suited to data of this size, which is the case of the
large-scale real-world datasets considered in our evaluation.
In contrast, the main thrust of the QUADRICONS algorithm lies
in the localisation of the quadri-generators (QGs), which
stand at the "antipodes" of the closures within their
respective equivalence classes. Our strategy for locating
these QGs makes the extraction of quadri-concepts faster than
that of its competitor. This is even more significant in the
case of our real-world datasets, which contain tens to
hundreds of thousands of quadruples.
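
The quoted speedups can be re-derived from the runtime columns of Tables IV and V. The sketch below only redoes that arithmetic for the two rows with the largest ratio; the dictionary keys and the function name `speedup` are illustrative assumptions.

```python
# (QuadriCons seconds, Data-Peeler seconds) copied from Tables IV and V,
# for the rows that give the largest ratio on each dataset.
runs = {
    "Last.fm, 186479 quadruples, minsupp (3, 2, 1, 1)": (0.77, 255.71),
    "MovieLens, 95580 quadruples, minsupp (2, 2, 2, 1)": (3.79, 472.87),
}


def speedup(quadricons_sec, data_peeler_sec):
    """Ratio of Data-Peeler runtime to QuadriCons runtime."""
    return data_peeler_sec / quadricons_sec


speedups = {name: speedup(*times) for name, times in runs.items()}
for name, ratio in speedups.items():
    print(f"{name}: {ratio:.1f}x")
```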

C. Consumed Memory 

Tables IV and V show the memory consumed by both
algorithms on both datasets for the different numbers of
quadruples. We observe that QUADRICONS consumes far less
memory than its competitor: less than 40000 KB on MovieLens
and less than 20000 KB on Last.FM, versus hundreds of
thousands to millions of KB for Data-Peeler. This difference
is explained by the fact that QUADRICONS, unlike Data-Peeler,
does not store the dataset in memory before proceeding to the
extraction of quadri-concepts. Furthermore, QUADRICONS
generates fewer candidates thanks to the detection of
quadri-generators, which reduces the search space
significantly. For example, to extract the 167
quadri-concepts from Last.FM when $minsupp_u = 3$,
$minsupp_t = 2$, $minsupp_r = 1$ and $minsupp_d = 1$,
QUADRICONS requires only 1754 KB of memory while detecting
the 939 quadri-generators of the dataset. In contrast, despite
the small number of extracted quadri-concepts, Data-Peeler
requires 788021 KB of memory to store the entire dataset
before generating candidates. Hence, detecting
quadri-generators before extracting quadri-concepts allows
QUADRICONS to consume far less memory than Data-Peeler: 54
and 115 times less on the complete MovieLens and Last.fm
datasets, respectively, under the lowest thresholds (last
rows of Tables IV and V).
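
The memory ratios can be recomputed in the same way from the memory columns of Tables IV and V. This is only a sketch of the arithmetic; the labels and the name `memory_ratio` are assumptions, and the rows used are the last row of each table plus the Last.FM row quoted in the text.

```python
# (QuadriCons KB, Data-Peeler KB) copied from Tables IV and V.
rows = {
    "MovieLens, last row of Table IV": (38762, 2098452),
    "Last.fm, last row of Table V": (18976, 2188452),
    "Last.fm, 186479 quadruples, minsupp (3, 2, 1, 1)": (1754, 788021),
}


def memory_ratio(quadricons_kb, data_peeler_kb):
    """How many times more memory Data-Peeler consumed."""
    return data_peeler_kb / quadricons_kb


ratios = {name: memory_ratio(*kb) for name, kb in rows.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x")
```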





|Y|       QUADRICONS   Consumed Memory   Data-Peeler   Consumed Memory
          (sec)        (kilobytes)       (sec)         (kilobytes)

minsupp_u = 3, minsupp_t = 2, minsupp_r = 1, minsupp_d = 1
25000     0.86         542               43.10         209843
50000     2.05         1361              110.72        378907
70000     3.08         1760              198.33        509541
95580     4.61         2087              288.00        654761

minsupp_u = 2, minsupp_t = 2, minsupp_r = 2, minsupp_d = 1
25000     0.36         198               39.98         399672
50000     0.97         431               107.71        508943
70000     1.96         567               227.65        667006
95580     3.79         1182              472.87        842551

minsupp_u = 2, minsupp_t = 2, minsupp_r = 1, minsupp_d = 1
25000     5.76         2491              421.44        769822
50000     15.92        5246              1269.70       976200
70000     29.22        9845              2037.73       1153401
95580     48.92        16556             3478.98       1446242

minsupp_u = 2, minsupp_t = 1, minsupp_r = 1, minsupp_d = 1
25000     97.56        10982             1022.12       1272988
50000     188.61       14671             1987.06       1561992
70000     263.63       19548             2876.02       1751258
95580     528.58       38762             5965.94       2098452

Table IV

Performances of QuadriCons vs. Data-Peeler on the
MovieLens dataset.



D. Compactness of Quadri-Concepts

Figure 3 shows the number of frequent quadri-concepts
versus the number of frequent quadri-sets on both the
MovieLens and Last.FM datasets for the different numbers of
quadruples. We observe that, for both datasets, the number of
frequent quadri-sets increases massively as the number of
quadruples grows. Indeed, frequent quadri-concepts become
larger, i.e., they contain more users, tags, resources and




|Y|       QUADRICONS   Consumed Memory   Data-Peeler   Consumed Memory
          (sec)        (kilobytes)       (sec)         (kilobytes)

minsupp_u = 3, minsupp_t = 2, minsupp_r = 1, minsupp_d = 1
40000     0.05         114               7.13          309453
80000     0.10         342               28.12         445431
120000    0.22         656               61.60         550932
150000    0.45         1241              119.45        678542
186479    0.77         1754              255.71        788021

minsupp_u = 2, minsupp_t = 2, minsupp_r = 2, minsupp_d = 1
40000     0.39         177               32.29         456323
80000     0.53         421               57.06         590012
120000    1.60         782               182.40        698672
150000    3.39         1025              354.71        826862
186479    5.87         1672              496.55        932871

minsupp_u = 2, minsupp_t = 2, minsupp_r = 1, minsupp_d = 1
40000     0.84         1876              51.88         498672
80000     2.94         3891              201.58        780762
120000    8.71         6789              487.92        1198451
150000    17.81        11342             1049.34       1343572
186479    29.78        14562             1949.14       1552789

minsupp_u = 2, minsupp_t = 1, minsupp_r = 1, minsupp_d = 1
40000     2.91         6724              89.77         1008273
80000     6.87         11562             221.93        1336451
120000    21.87        14345             724.47        1542006
150000    46.52        15623             1524.76       1772919
186479    88.16        18976             3118.85       2188452

Table V

Performances of QuadriCons vs. Data-Peeler on the
Last.fm dataset.



dates. Thus, such concepts cause the steep increase in the
number of frequent quadri-sets. Across the two datasets, the
frequent quadri-concepts represent up to 3.68% and 28.99% of
the number of frequent quadri-sets. Hence, computing frequent
quadri-sets is a harder task than computing frequent
quadri-concepts, which provide the same information.

VII. Conclusion and Perspectives 

In this paper, we considered the quadratic context formally
described by a d-folksonomy, which introduces a new
dimension: the time stamp. We extended the notions of
closure operator and tri-generator to the four-dimensional
case and thoroughly studied their theoretical properties.
Then, we proposed the QUADRICONS algorithm in order
to extract frequent quadri-concepts from d-folksonomies.
Several experiments show that QUADRICONS provides an
efficient method for mining quadri-concepts in large-scale
conceptual structures. Mining quadri-concepts stands at the
crossroads of our avenues for future work: (i) analyse the
evolution of users, tags and


[Figure 3: two plots, each titled "Number of Quadri-Concepts vs. that of
Quadri-Sets", with the number of quadruples on the x-axis. Series:
Quadri-Concepts(1)-(4) and Quadri-Sets(1)-(4), for the threshold settings
(1) minsupp_u=3, minsupp_t=2, minsupp_r=1, minsupp_d=1;
(2) minsupp_u=2, minsupp_t=2, minsupp_r=2, minsupp_d=1;
(3) minsupp_u=2, minsupp_t=2, minsupp_r=1, minsupp_d=1;
(4) minsupp_u=2, minsupp_t=1, minsupp_r=1, minsupp_d=1]

Figure 3. Number of frequent quadri-concepts vs. number of frequent
quadri-sets on both datasets. (Top) Last.FM (Bottom) MovieLens



resources through time, and (ii) define the quadratic form of
association rules according to quadri-concepts.

Abstract - Folksonomy mining is attracting the interest of
the Web 2.0 community, insofar as users freely tag resources.
However, a scrutiny of the related work unveils that the time
stamp dimension has not been considered. For example, the
large number of works dedicated to mining tri-concepts from
folksonomies did not take the time dimension into account. In
this paper, we consider a folksonomy commonly composed of
triples <users, tags, resources> and we consider time as a
new dimension. We motivate our approach by highlighting
the battery of potential applications and we introduce a new
algorithm, called QuadriCons, as an extension of TriCons,
which is dedicated to triadic contexts. QuadriCons aims at
extracting quadratic concepts, i.e., quadri-concepts, from
quadratic contexts. We also introduce a new closure operator
that splits the induced search space into equivalence classes
whose smallest elements are the quadri-minimal generators.
Experiments carried out on large-scale real-world snapshots
of social networks highlight very interesting results about
trend detection in folksonomies starting from quadri-concepts.

Keywords -Quadratic Context; Formal Concept Analysis; 
Quadratic Concepts; Folksonomies; algorithm 

I. Introduction 

- FCA extended since '95 to Triadic Concept Analysis:
not much attention until the coming of folksonomies

- Folksonomy (informal definition, rise of folksonomies) / Web 2.0
[1] Folksonomy (from folk and taxonomy) is a neologism
for a practice of collaborative categorization using freely
chosen keywords. Folksonomies (also called social tagging
mechanisms) have been implemented in a number of online
knowledge sharing environments since the idea was first
adopted by the social bookmarking site del.icio.us in 2004.
The idea of a folksonomy is to allow the users to describe
a set of shared objects with a set of keywords of their own
choice.

- Folksonomy mining, tri-concepts: condensed representation
of folksonomies; TriCons outperforms them

- Timestamp overlooked: importance of time in one sentence

Time is considered one of the most important factors in
detecting emerging subjects. - In this paper, confluence of
both lines of research: FCA (4-ary case: QCA) + mining
sequential patterns; give the Harry Potter example

The remainder of the paper is organized as follows.
Section 2 recalls the key notions used throughout this paper.
We thoroughly study the related work in Section 3. In
Section 4, we introduce a new closure operator for the
quadratic context as well as the QUADRICONS algorithm
dedicated to the extraction of frequent quadri-concepts.
Section 5 reports experiments on the performance of
our algorithm and on trend detection. Finally, we
conclude the paper with a summary and we sketch some
avenues for future work in Section 6.

II. Motivation : Conceptual and Temporal 
Clustering of Folksonomies 

The immediate success of social networks, i.e., social
resource sharing systems, is due to the fact that no specific
skills are needed for participating. Each individual user is
able to share a web page¹, a personal photo², an artist they
like³ or a movie they watched⁴ without much effort.

The core data structure of such systems is a folksonomy.
It consists of three sets U, T, R of users assigning
tags to resources as well as a ternary relation Y between
them. To allow conceptual and temporal clustering from
folksonomies, an additional dimension is needed: time.
Within this new dimension, our goal is to detect hidden
sequential conceptualizations in folksonomies. An example
of such a concept is that users who tagged "Harry Potter"
will tag "The Prisoner of Azkaban" and then tag "The Order
of the Phoenix", probably with the same tags.

Our algorithm solves the problem of frequent closed 
patterns mining for this kind of data. It will return a set 
of (frequent) quadruples, where each quadruple (U, T, R, 
D) consists of a set U of users, a set T of tags, a set 
R of resources and a set D of dates. These quadruples, 
called (frequent) quadri-concepts, have the property that 
each user in U has tagged each resource in R with all 
tags from T at the different dates from D, and that none 
of these sets can be extended without shrinking one of 
the other three dimensions. Hence, they represent the four- 
dimensional extension of tri-concepts. Moreover, we can add 
minimum support constraints on each of the four dimensions 
in order to focus on the largest concepts of the folksonomy, 
i.e., by setting higher values of minimum supports. 

III. The Problem of Mining all Frequent 
Quadri-Concepts 

In this section, we formalize the problem of mining all
frequent quadri-concepts. We start with an adaptation of the
notion of folksonomy to the quadratic context.

Definition 1: (D-FOLKSONOMY) A d-folksonomy is a set
of tuples $\mathbb{F}_d = (\mathcal{U}, \mathcal{T}, \mathcal{R}, \mathcal{D}, Y)$ where $\mathcal{U}$, $\mathcal{T}$, $\mathcal{R}$ and $\mathcal{D}$ are

¹ http://del.icio.us
² http://flickr.com
³ http://last.fm
⁴ http://movielens.org



finite sets whose elements are called users, tags, resources
and dates. $Y \subseteq \mathcal{U} \times \mathcal{T} \times \mathcal{R} \times \mathcal{D}$ represents a quaternary
relation where each $y \in Y$ can be represented by a quadruple:
$y = \{(u, t, r, d) \mid u \in \mathcal{U}, t \in \mathcal{T}, r \in \mathcal{R}, d \in \mathcal{D}\}$, which
means that the user u has annotated the resource r using the
tag t at the date d.

Example 1: Table I depicts an example of a d-folksonomy
$\mathbb{F}_d$ with $\mathcal{U} = \{u_1, u_2, u_3, u_4\}$, $\mathcal{T} = \{t_1, t_2, t_3\}$, $\mathcal{R} = \{r_1, r_2\}$
and $\mathcal{D} = \{d_1, d_2\}$. Each cross within the quaternary relation
indicates a tagging operation by a user from $\mathcal{U}$, a tag from
$\mathcal{T}$ and a resource from $\mathcal{R}$ at a date from $\mathcal{D}$, i.e., a user has
tagged a particular resource with a particular tag at a date
d. For example, the user $u_1$ has tagged the resource $r_1$ with
the tags $t_1$, $t_2$ and $t_3$ at the date $d_1$.

Table I
A d-folksonomy.

The following definition introduces a (frequent) quadri- 
set. 

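Definition 2: (A (FREQUENT) QUADRI-SET) Let $\mathbb{F}_d = (\mathcal{U}, \mathcal{T}, \mathcal{R}, \mathcal{D}, Y)$
be a d-folksonomy. A quadri-set of $\mathbb{F}_d$
is a quadruple (A, B, C, E) with $A \subseteq \mathcal{U}$, $B \subseteq \mathcal{T}$, $C \subseteq \mathcal{R}$
and $E \subseteq \mathcal{D}$ such that $A \times B \times C \times E \subseteq Y$.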
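
Definition 2 translates directly into a membership test. The sketch below assumes the relation Y is stored as a Python set of (user, tag, resource, date) tuples; the function name `is_quadri_set` and the toy relation are hypothetical and are not the content of Table I.

```python
from itertools import product


def is_quadri_set(A, B, C, E, Y):
    """Check that A x B x C x E is included in the quaternary relation Y."""
    return all(quad in Y for quad in product(A, B, C, E))


# Hypothetical toy relation, used only for illustration.
Y = {
    ("u1", "t1", "r1", "d1"), ("u1", "t2", "r1", "d1"),
    ("u2", "t1", "r1", "d1"), ("u2", "t2", "r1", "d1"),
    ("u2", "t2", "r2", "d1"),
}

inside = is_quadri_set({"u1", "u2"}, {"t1", "t2"}, {"r1"}, {"d1"}, Y)
outside = is_quadri_set({"u1", "u2"}, {"t1", "t2"}, {"r1", "r2"}, {"d1"}, Y)
print(inside, outside)
```

The second call fails because, for example, ("u1", "t1", "r2", "d1") is not in the toy relation.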

D-folksonomies have four dimensions which are completely
symmetric. Thus, we can define minimum support
thresholds on each dimension. Hence, the problem of mining
frequent quadri-sets is the following:

Problem 1: (MINING ALL FREQUENT QUADRI-SETS) Let $\mathbb{F}_d = (\mathcal{U}, \mathcal{T}, \mathcal{R}, \mathcal{D}, Y)$
be a d-folksonomy and let $minsupp_u$, $minsupp_t$, $minsupp_r$ and
$minsupp_d$ be user-defined minimum thresholds. The task of
mining all frequent quadri-sets consists in determining all
quadri-sets (A, B, C, E) of $\mathbb{F}_d$ with
$|A| \ge minsupp_u$, $|B| \ge minsupp_t$, $|C| \ge minsupp_r$
and $|E| \ge minsupp_d$.

Our thresholds are antimonotonic constraints: if $(A_1, B_1, C_1, E_1)$
with $A_1$ being maximal for $A_1 \times B_1 \times C_1 \times E_1 \subseteq Y$
is not u-frequent⁵, then all $(A_2, B_2, C_2, E_2)$ with $B_1 \subseteq B_2$,
$C_1 \subseteq C_2$ and $E_1 \subseteq E_2$ are not u-frequent either. The same
holds symmetrically for the other three dimensions. In [2], the
authors demonstrate that above the two-dimensional case, the
direct symmetry between monotonicity and antimonotonicity
breaks. Thus, they introduced a lemma which results from

⁵ With regard to the users dimension.



the triadic Galois connection [?] induced by a triadic context. 
In the following, we adapt that lemma to our quadratic case. 

Lemma 1: Let $(A_1, B_1, C_1, E_1)$ and $(A_2, B_2, C_2, E_2)$
be quadri-sets with $A_i$ being maximal for $A_i \times B_i \times C_i \times E_i \subseteq Y$,
for i = 1, 2. If $B_1 \subseteq B_2$, $C_1 \subseteq C_2$ and $E_1 \subseteq E_2$,
then $A_2 \subseteq A_1$. The same holds symmetrically for the other
three directions. In the sequel, the inclusion
$(A_1, B_1, C_1, E_1) \subseteq (A_2, B_2, C_2, E_2)$ holds if and only if
$B_1 \subseteq B_2$, $C_1 \subseteq C_2$, $E_1 \subseteq E_2$ and $A_2 \subseteq A_1$.

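Example 2: Let $\mathbb{F}_d$ be the d-folksonomy of Table I and
let $S_1 = \{\{u_3, u_4\}, t_3, \{r_1, r_2\}, \{d_1, d_2\}\}$ and
$S_2 = \{\{u_1, u_3, u_4\}, \{t_2, t_3\}, \{r_1, r_2\}, d_1\}$ be two quadri-sets of $\mathbb{F}_d$.
We have $S_1 \subseteq S_2$ since $\{u_3, u_4\} \subseteq \{u_1, u_3, u_4\}$, $t_3 \subseteq \{t_2, t_3\}$,
$\{r_1, r_2\} \subseteq \{r_1, r_2\}$ and $d_1 \subseteq \{d_1, d_2\}$.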

As the set of all frequent quadri-sets is highly redundant, 
we consider a specific condensed representation, i.e., a 
subset which contains the same information : the set of all 
frequent quadri-concepts. The latter's definition is given as 
follows : 

Definition 3: ((FREQUENT) QUADRATIC CONCEPT) A
quadratic concept (or a quadri-concept for short) of a
d-folksonomy $\mathbb{F}_d = (\mathcal{U}, \mathcal{T}, \mathcal{R}, \mathcal{D}, Y)$ is a quadruple (U, T,
R, D) with $U \subseteq \mathcal{U}$, $T \subseteq \mathcal{T}$, $R \subseteq \mathcal{R}$ and $D \subseteq \mathcal{D}$ with
$U \times T \times R \times D \subseteq Y$ such that the quadruple (U, T, R, D) is
maximal, i.e., none of these sets can be extended without
shrinking one of the other three dimensions. A
quadri-concept is said to be frequent whenever it is a
frequent quadri-set.

Problem 2: (MINING ALL FREQUENT QUADRI-CONCEPTS) Let
$\mathbb{F}_d = (\mathcal{U}, \mathcal{T}, \mathcal{R}, \mathcal{D}, Y)$ be a d-folksonomy and let $minsupp_u$,
$minsupp_t$, $minsupp_r$ and $minsupp_d$ be user-defined
minimum thresholds. The task of mining all frequent
quadri-concepts consists in determining all quadri-concepts
(U, T, R, D) of $\mathbb{F}_d$ with $|U| \ge minsupp_u$, $|T| \ge minsupp_t$,
$|R| \ge minsupp_r$ and $|D| \ge minsupp_d$. The set of all
frequent quadri-concepts of $\mathbb{F}_d$ is equal to
$\mathcal{QC} = \{QC \mid QC = (U, T, R, D)$ is a frequent quadri-concept$\}$.

Remark 1: It is important to note that the extracted
representation of quadri-concepts is information lossless.
Hence, after solving Problem 2, we can easily solve Problem
1 by enumerating all quadri-sets (A, B, C, E) such that there
exists a frequent quadri-concept (U, T, R, D) with $A \subseteq U$,
$B \subseteq T$, $C \subseteq R$, $E \subseteq D$ and $|A| \ge minsupp_u$,
$|B| \ge minsupp_t$, $|C| \ge minsupp_r$ and $|E| \ge minsupp_d$.

In the remainder, we will scrutinize the state-of-the-art 
propositions aiming to mine quadratic concepts from d- 
folksonomies. 
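
The enumeration described in Remark 1 can be sketched as follows: every frequent quadri-set is obtained by taking, inside some frequent quadri-concept, sub-components that still reach the thresholds. The function names and the sample concept are assumptions made for illustration; the enumeration is exponential in the size of the concept and is only meant for toy inputs.

```python
from itertools import chain, combinations, product


def subsets_at_least(items, min_size):
    """All subsets of `items` (as sorted tuples) with at least `min_size` elements."""
    items = sorted(items)
    return chain.from_iterable(
        combinations(items, k) for k in range(max(min_size, 1), len(items) + 1)
    )


def quadri_sets_from_concepts(concepts, minsupp):
    """Enumerate every frequent quadri-set covered by some frequent quadri-concept."""
    found = set()
    for concept in concepts:
        parts = [subsets_at_least(dim, m) for dim, m in zip(concept, minsupp)]
        found.update(product(*parts))
    return found


# Illustrative quadri-concept (users, tags, resources, dates).
concept = ({"u1", "u2", "u4"}, {"t1", "t2"}, {"r1", "r2"}, {"d1"})
derived = quadri_sets_from_concepts([concept], (2, 2, 1, 1))
print(len(derived))
```

With thresholds (2, 2, 1, 1) this concept covers 4 user subsets, 1 tag subset, 3 resource subsets and 1 date subset, which illustrates why the set of quadri-sets is much larger than the set of quadri-concepts.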

IV. Related Work 

- Voutsadakis
- Data-Peeler

without criticism!!! (leave it for the comparison)

In [3], Voutsadakis generalized the constructs and results
of Wille [2] to n-adic contexts. The author gives a
definition of an n-adic concept as well as that of a complete
n-lattice of an n-adic context. Moreover, it was shown that the
n-adic concepts of an n-adic context K form a complete
n-lattice with respect to component-wise defined quasi-orders.
To illustrate those new definitions, Voutsadakis gives an
example of quadratic concepts and their associated complete
Boolean 4-lattice. Despite this robust theoretical study, no
algorithm has been proposed by Voutsadakis for an efficient
extraction of quadratic concepts. In addition, apart from that
of an n-adic concept, no basic notion of data mining (minimal
generator, equivalence class, etc.) was adapted to the n-adic
context. Finally, no potential applications were proposed in
order to illustrate the usefulness of such concepts. Recently,
Cerf et al. proposed the Data-Peeler algorithm [4], which
is able to extract all closed concepts from n-ary relations.
It enumerates all the n-dimensional closed patterns in a
depth-first manner using a binary tree enumeration strategy.
When n = 4, the Data-Peeler algorithm is able to extract
quadratic concepts. However, Data-Peeler is hampered by
the large number of elements that any of the dimensions may
contain; its strategy then becomes ineffective and leads to
a complex computation of n-adic concepts. In the following,
we review some approaches dealing with trend detection in
folksonomies, which illustrate the usefulness of quadratic
concepts and of considering the time dimension in
folksonomies.

V. The QuadriCons Algorithm for Mining all 
Frequent Quadri-Concepts 

A. Main Notions of QuadriCons 

Before introducing our closure operator for a
d-folksonomy, we define a closure operator of an n-adic context.
In [5], Voutsadakis defines n closure operators for an n-adic
context. Each i-closure operator aims to compute the closed
part related to the dimension i for a given n-set ($1 \le i \le n$).
In what follows, we introduce a new closure operator
h which is able to compute the closure of a given n-set.
In contrast to [5], we use a single closure operator that
computes all closed parts of the resulting n-adic concept
at once.

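Definition 4: (CLOSURE OPERATOR OF AN n-ADIC CONTEXT)
Let $S = (S_1, S_2, \ldots, S_n)$ be an n-set, with $S_1$ being
maximal for $S_1 \times \ldots \times S_n \subseteq Y$, of an n-adic context
$\mathbb{K}^n$ with n dimensions, i.e., $\mathbb{K}^n = (\mathcal{D}_1, \mathcal{D}_2, \ldots, \mathcal{D}_n, Y)$.
A mapping h is defined as follows:

$h(S) = h(S_1, S_2, \ldots, S_n) = (C_1, C_2, \ldots, C_n)$ such that:

$C_1 = S_1$

$\wedge\ C_2 = \{c_2 \in \mathcal{D}_2 \mid (c_1, c_2, c_3, \ldots, c_n) \in Y\ \forall c_1 \in C_1, \forall c_3 \in S_3, \ldots, \forall c_n \in S_n\}$

$\vdots$

$\wedge\ C_n = \{c_n \in \mathcal{D}_n \mid (c_1, c_2, \ldots, c_{n-1}, c_n) \in Y\ \forall c_1 \in C_1, \ldots, \forall c_{n-1} \in C_{n-1}\}$

Proposition 1: h is a closure operator.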



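Proof: To prove that h is a closure operator, we have to
prove that it fulfills the three properties
of extensivity, idempotency and isotony [6].

(1) Extensivity: Let $S = (S_1, S_2, \ldots, S_n)$ be an n-set
of $\mathbb{K}^n$ $\Rightarrow$ $h(S) = (C_1, C_2, \ldots, C_n)$ such that: $C_1 = S_1$,
$C_2 = \{c_2 \in \mathcal{D}_2 \mid (c_1, c_2, c_3, \ldots, c_n) \in Y\ \forall c_1 \in C_1, \forall c_3 \in S_3, \ldots, \forall c_n \in S_n\} \supseteq S_2$
since $C_1 = S_1$, ...,
$C_n = \{c_n \in \mathcal{D}_n \mid (c_1, c_2, \ldots, c_{n-1}, c_n) \in Y\ \forall c_1 \in C_1, \forall c_2 \in C_2, \ldots, \forall c_{n-1} \in C_{n-1}\} \supseteq S_n$
since $C_1 = S_1$, $C_2 \supseteq S_2$, ..., $C_{n-1} \supseteq S_{n-1}$. Then, $C_1 = S_1$
and $S_i \subseteq C_i$ for $i = 2, \ldots, n$ $\Rightarrow$ $S \subseteq h(S)$ (cf. Lemma 1).

(2) Idempotency: Let $S = (S_1, S_2, \ldots, S_n)$ be an n-set
of $\mathbb{K}^n$ $\Rightarrow$ $h(S) = (C_1, C_2, \ldots, C_n)$ $\Rightarrow$ $h(C_1, C_2, \ldots, C_n) = (C'_1, C'_2, \ldots, C'_n)$
such that: $C'_1 = C_1$,
$C'_2 = \{c_2 \in \mathcal{D}_2 \mid (c_1, c_2, c_3, \ldots, c_n) \in Y\ \forall c_1 \in C'_1, \forall c_3 \in S_3, \ldots, \forall c_n \in S_n\} = C_2$
since $C'_1 = C_1$, ...,
$C'_n = \{c_n \in \mathcal{D}_n \mid (c_1, c_2, \ldots, c_{n-1}, c_n) \in Y\ \forall c_1 \in C'_1, \forall c_2 \in C'_2, \ldots, \forall c_{n-1} \in C'_{n-1}\} = C_n$
since we have $C'_1 = C_1$, $C'_2 = C_2$, ..., $C'_{n-1} = C_{n-1}$.
Then, $C'_i = C_i$ for $i = 1, \ldots, n$ $\Rightarrow$ $h(h(S)) = h(S)$.

(3) Isotony: Let $S = (S_1, S_2, \ldots, S_n)$ and $S' = (S'_1, S'_2, \ldots, S'_n)$
be two n-sets of $\mathbb{K}^n$ with $S \subseteq S'$, i.e., $S'_1 \subseteq S_1$
and $S_i \subseteq S'_i$ for $i = 2, \ldots, n$ (cf. Lemma 1). We have
$h(S) = (C_1, C_2, \ldots, C_n)$ and $h(S') = (C'_1, C'_2, \ldots, C'_n)$ such that:

• $C_1 = S_1$, $C'_1 = S'_1$ and $S'_1 \subseteq S_1$ $\Rightarrow$ $C'_1 \subseteq C_1$

• $C_2 = \{c_2 \in \mathcal{D}_2 \mid (c_1, c_2, c_3, \ldots, c_n) \in Y\ \forall c_1 \in C_1, \forall c_3 \in S_3, \ldots, \forall c_n \in S_n\}$
and $C'_2 = \{c_2 \in \mathcal{D}_2 \mid (c_1, c_2, c_3, \ldots, c_n) \in Y\ \forall c_1 \in C'_1, \forall c_3 \in S'_3, \ldots, \forall c_n \in S'_n\}$
$\Rightarrow$ $C_2 \subseteq C'_2$ since $S_i \subseteq S'_i$ for $i = 3, \ldots, n$ and
$C'_1 \subseteq C_1$ (cf. Lemma 1)

• ...

• $C_n = \{c_n \in \mathcal{D}_n \mid (c_1, c_2, \ldots, c_{n-1}, c_n) \in Y\ \forall c_1 \in C_1, \forall c_2 \in C_2, \ldots, \forall c_{n-1} \in C_{n-1}\}$
and $C'_n = \{c_n \in \mathcal{D}_n \mid (c_1, c_2, \ldots, c_{n-1}, c_n) \in Y\ \forall c_1 \in C'_1, \forall c_2 \in C'_2, \ldots, \forall c_{n-1} \in C'_{n-1}\}$
$\Rightarrow$ $C_n \subseteq C'_n$ since $C'_1 \subseteq C_1$, $C_2 \subseteq C'_2$, ..., $C_{n-1} \subseteq C'_{n-1}$ (cf. Lemma 1)

Then, $C'_1 \subseteq C_1$ and $C_i \subseteq C'_i$ for $i = 2, \ldots, n$ $\Rightarrow$ $h(S) \subseteq h(S')$.

According to (1), (2) and (3), h is a closure operator. ■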
For n = 4, we instantiate the closure operator of a quadratic
context, i.e., a d-folksonomy, as follows:

Definition 5: (CLOSURE OPERATOR OF A D-FOLKSONOMY)
Let S = (A, B, C, E) be a quadri-set of $\mathbb{F}_d$ with A being
maximal for $A \times B \times C \times E \subseteq Y$.
The closure operator h of a d-folksonomy is defined as
follows:

$h(S) = h(A, B, C, E) = (U, T, R, D) \mid U = A$

$\wedge\ T = \{t_i \in \mathcal{T} \mid (u_i, t_i, r_i, d_i) \in Y\ \forall u_i \in U, \forall r_i \in C, \forall d_i \in E\}$

$\wedge\ R = \{r_i \in \mathcal{R} \mid (u_i, t_i, r_i, d_i) \in Y\ \forall u_i \in U, \forall t_i \in T, \forall d_i \in E\}$

$\wedge\ D = \{d_i \in \mathcal{D} \mid (u_i, t_i, r_i, d_i) \in Y\ \forall u_i \in U, \forall t_i \in T, \forall r_i \in R\}$
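
One way to read Definition 5 operationally is the sketch below, which computes T, then R, then D in sequence, each time reusing the parts already closed. It assumes Y is a set of (user, tag, resource, date) tuples and that the input A is already maximal, as the definition requires; the function name `closure` and the toy relation are hypothetical.

```python
from itertools import product


def closure(A, B, C, E, Y, tags, resources, dates):
    """Sketch of h(A, B, C, E) = (U, T, R, D) following Definition 5.

    B is kept in the signature to mirror the definition, although the
    definition computes T from U, C and E only.
    """
    U = set(A)
    T = {t for t in tags
         if all((u, t, r, d) in Y for u, r, d in product(U, C, E))}
    R = {r for r in resources
         if all((u, t, r, d) in Y for u, t, d in product(U, T, E))}
    D = {d for d in dates
         if all((u, t, r, d) in Y for u, t, r in product(U, T, R))}
    return U, T, R, D


# Hypothetical toy relation: a full block at date d1 plus one stray tuple at d2.
Y = set(product({"u1", "u2"}, {"t1", "t2"}, {"r1", "r2"}, {"d1"}))
Y.add(("u1", "t1", "r1", "d2"))

result = closure({"u1", "u2"}, {"t1"}, {"r1"}, {"d1"}, Y,
                 tags={"t1", "t2"}, resources={"r1", "r2"}, dates={"d1", "d2"})
print(result)
```

On this toy input the quadri-set ({u1, u2}, t1, r1, d1) is closed into the full block at date d1, while d2 is excluded because only one tuple carries it.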

Remark 2: Roughly speaking, h(S) computes the largest
quadri-set in the d-folksonomy $\mathbb{F}_d$ which contains maximal
sets of tags, resources and dates shared by a group of users.
The application of the closure operator h on a quadri-set
gives rise to a quadri-concept QC = (U, T, R, D). In the
remainder of the paper, the U, R, T and D parts are
respectively called Extent, Intent, Modus and Variable.

As in the dyadic and triadic cases, the closure operator splits
the search space into equivalence classes, which we introduce
in the following:

Definition 6: (EQUIVALENCE CLASS) Let $S_1 = (A_1, B_1, C_1, E_1)$
and $S_2 = (A_2, B_2, C_2, E_2)$ be two quadri-sets of $\mathbb{F}_d$ and
QC be a frequent quadri-concept. $S_1$ and $S_2$ belong to the
same equivalence class represented by the quadri-concept
QC, i.e., $S_1 \equiv_{QC} S_2$, iff $h(S_1) = h(S_2) = QC$.



[Figure 1: diagram of an equivalence class]

Figure 1. Example of an equivalence class extracted from the d-folksonomy
depicted by Table I

Minimal Generators (MGs) have been shown to play an 
important role in many theoretical and practical problem 
settings involving closure systems. Such minimal generators 
can offer a complementary and simpler way to understand 
the concept, because they may contain far fewer attributes 
than closed concepts. Indeed, MGs represent the smallest 
elements within an equivalence class. Complementary to 
closures, minimal generators provide a way to characterize 
formal concepts [7]. In the following, we introduce an 
extension of the definition of a MG to the d-folksonomy. 

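Definition 7: (QUADRI-MINIMAL GENERATOR) Let g =
(A, B, C, E) be a quadri-set of $\mathbb{F}_d$ such that $A \subseteq \mathcal{U}$, $B \subseteq \mathcal{T}$,
$C \subseteq \mathcal{R}$ and $E \subseteq \mathcal{D}$, and let $QC \in \mathcal{QC}$. The quadruple g is
a quadri-minimal generator (quadri-generator for short) of
QC iff h(g) = QC and $\nexists\ g_1 = (A_1, B_1, C_1, E_1)$ such that:

1) $A = A_1$,

2) $(B_1 \subseteq B \wedge C_1 \subseteq C \wedge E_1 \subset E) \vee (B_1 \subseteq B \wedge C_1 \subset C \wedge E_1 \subseteq E)$, and

3) $h(g) = h(g_1) = QC$.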

Example 3: Let us consider the d-folksonomy $\mathbb{F}_d$ shown
in Table I. Figure 1 shows an example of an equivalence
class. For example, we have
$h(g_1 = \{\{u_1, u_2, u_3\}, t_3, r_1, d_1\}) = \{\{u_1, u_2, u_3\}, \{t_1, t_2, t_3\}, r_1, \{d_1, d_2\}\} = QC$
such that $g_1$ is a quadri-generator. Thus, QC is
the quadri-concept of this equivalence class. The largest
unsubsumed quadri-set QC has two quadri-generators $g_1$
and $g_2$. However, $g_3 = \{\{u_1, u_2, u_3\}, \{t_1, t_3\}, r_1, d_1\}$
is not a quadri-generator of QC since there exists $g_1$ such
that $g_1.extent = g_3.extent$, $g_1.intent = g_3.intent$ $\wedge$ $g_1.modus \subseteq g_3.modus$
$\wedge$ $g_1.variable = g_3.variable$.

Based on these newly introduced notions, we propose in the
following our new QUADRICONS algorithm for a scalable
mining of frequent quadri-concepts from a d-folksonomy.

B. The QUADRlCONS Algorithm 

In the following, we introduce a test-and-generate algorithm,
called QuadriCons, for mining frequent quadri-concepts
from a d-folksonomy. Since quadri-generators are the
minimal keys of an equivalence class, their detection is
largely eased. QuadriCons operates in four steps: the
FindMinimalGenerators procedure is invoked as a first
step to extract the quadri-generators. Then, the
ClosureCompute procedure is invoked in the next three
steps in order to compute, respectively, the modus, intent
and variable parts of the quadri-concepts. The pseudo-code
of the QuadriCons algorithm is sketched by Algorithm 1.
QuadriCons takes as input a d-folksonomy Fd = (U, T, R,
D, Y) as well as four user-defined thresholds (one for
each dimension): minsuppu, minsuppt, minsuppr and
minsuppd. The output of the QuadriCons algorithm is the
set of all frequent quadri-concepts that fulfill these
thresholds. QuadriCons works as follows: it starts by
invoking the FindMinimalGenerators procedure (Step 1),
whose pseudo-code is given by Algorithm 2, in order to
extract the quadri-generators stored in the set MG (Line 3).
For this extraction, FindMinimalGenerators computes,
for each triple (t, r, d), the set Us representing the maximal
set of users sharing both tag t and resource r at the date d
(Algorithm 2, Line 3). If |Us| is frequent w.r.t. minsuppu
(Line 4), a quadri-generator is then created (if it does not
already exist) with the appropriate fields (Line 5). Algorithm
2 invokes the AddQuadri function, which adds the
quadri-generator g to the set MG (Line 7).
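
The extraction step described above can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: the function name, the dictionary layout and the toy quadruples are assumptions made here, and the toy data are not those of Table I. Grouping by (tag, resource, date) makes each candidate unique, so the explicit "already exists" test of Algorithm 2 is not needed in this sketch.

```python
from collections import defaultdict


def find_minimal_generators(quadruples, minsupp_u):
    """Sketch of Step 1: one candidate per (tag, resource, date) triple.

    `quadruples` is an iterable of (user, tag, resource, date) tuples,
    i.e. the relation Y of the d-folksonomy.
    """
    # Us for each triple: every user who shared tag t on resource r at date d.
    users_by_triple = defaultdict(set)
    for user, tag, resource, date in quadruples:
        users_by_triple[(tag, resource, date)].add(user)

    generators = []
    for (tag, resource, date), users in sorted(users_by_triple.items()):
        if len(users) >= minsupp_u:  # keep only frequent extents
            generators.append({
                "extent": frozenset(users),
                "modus": frozenset({tag}),
                "intent": frozenset({resource}),
                "variable": frozenset({date}),
            })
    return generators


# Toy relation (hypothetical data, for illustration only).
Y = [
    ("u1", "t1", "r1", "d1"),
    ("u2", "t1", "r1", "d1"),
    ("u1", "t2", "r1", "d1"),
    ("u3", "t2", "r2", "d2"),
]

for g in find_minimal_generators(Y, minsupp_u=2):
    print(sorted(g["extent"]), sorted(g["modus"]),
          sorted(g["intent"]), sorted(g["variable"]))
```

With minsuppu = 2, only the triple (t1, r1, d1) is shared by enough users in this toy relation, so a single quadri-generator is produced.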

Thereafter, QuadriCons invokes the ClosureCompute
procedure (Step 2), whose pseudo-code is given by
Algorithm 3, for each quadri-generator of MG (Lines
5-7): the aim is to compute the modus part of each
quadri-concept. At this point, the first two cases of
Algorithm 3 (Lines 3 and 6) have to be considered w.r.t.
the extent of each quadri-generator. The ClosureCompute
procedure returns the set QS formed by quadri-sets. The
indicator flag (equal



ALGORITHM 3 : ClosureCompute

Data :
  1) S_in : The set of frequent quadri-generators/quadri-sets.
  2) minu, mint, minr : User-defined thresholds of extent, modus and intent support.
  3) q : A quadri-generator/quadri-set.
  4) i : An indicator.
Results : S_out : {The set of frequent quadri-sets/quadri-concepts}.

1  Begin
2    Foreach quadri-set q' ∈ S_in do
3      If i=1 and q.intent = q'.intent and q.extent ⊆ q'.extent then
4        s.intent = q.intent; s.extent = q.extent; s.variable = q.variable; s.modus = q.modus ∪ q'.modus;
         AddQuadri(S_out, s);
5      End
6      Else if i=1 and q.intent = q'.intent and q and q' incomparable then
7        g.extent = q.extent ∩ q'.extent; g.modus = q.modus ∪ q'.modus; g.intent = q.intent; g.variable = q.variable;
8        If |g.extent| ≥ minu then AddQuadri(MG, g);
9      End
10     Else if i=2 and q.extent ⊆ q'.extent and q.modus ⊆ q'.modus and q.intent ≠ q'.intent then
11       QC.extent = q.extent; QC.modus = q.modus; QC.variable = q.variable; QC.intent = q.intent ∪ q'.intent;
12       AddQuadri(S_out, QC);
13     End
14     Else if i=2 and q and q' incomparable then
15       s.extent = q.extent ∩ q'.extent; s.modus = q.modus ∩ q'.modus; s.variable = q.variable; s.intent = q.intent ∪ q'.intent;
16       If |s.extent| ≥ minu and |s.modus| ≥ mint then AddQuadri(S_out, s);
17     End
18     Else if i=3 and q.extent ⊆ q'.extent and q.modus ⊆ q'.modus and q.intent ⊆ q'.intent and q.variable ≠ q'.variable then
19       QC.extent = q.extent; QC.modus = q.modus; QC.intent = q.intent; QC.variable = q.variable ∪ q'.variable;
20       AddQuadri(S_out, QC);
21     End
22     Else if i=3 and q and q' incomparable then
23       s.extent = q.extent ∩ q'.extent; s.modus = q.modus ∩ q'.modus; s.intent = q.intent ∩ q'.intent; s.variable = q.variable ∪ q'.variable;
24       If |s.extent| ≥ minu and |s.modus| ≥ mint and |s.intent| ≥ minr then AddQuadri(S_out, s);
25     End
26   end
27 End
28 return S_out;



ALGORITHM 1 : QuadriCons

Data :
  1) Fd (U, T, R, D, Y) : A d-folksonomy.
  2) minsuppu, minsuppt, minsuppr, minsuppd : User-defined thresholds.
Results : QC : {Frequent quadri-concepts}.

1  Begin
2    /*Step 1 : The extraction of quadri-generators*/
3    FindMinimalGenerators(Fd, MG, minsuppu);
4    /*Step 2 : The computation of the modus part*/
5    Foreach quadri-gen g ∈ MG do
6      ClosureCompute(MG, minsuppu, minsuppt, minsuppr, g, QS, 1);
7    end
8    PruneInfrequentSets(QS, minsuppt);
9    /*Step 3 : The computation of the intent part*/
10   Foreach quadri-set s ∈ QS do
11     ClosureCompute(QS, minsuppu, minsuppt, minsuppr, s, QC, 2);
12   end
13   PruneInfrequentSets(QC, minsuppr);
14   /*Step 4 : The computation of the variable part*/
15   Foreach quadri-set s ∈ QS do
16     ClosureCompute(QS, minsuppu, minsuppt, minsuppr, s, QC, 3);
17   end
18   PruneInfrequentSets(QC, minsuppd);
19 End
20 return QC;



to 1 here) set by QuadriCons shows whether the quadri-set
considered by the ClosureCompute procedure is a
quadri-generator. In the third step, QuadriCons invokes
the ClosureCompute procedure a second time for each
quadri-set of QS (Lines 10-12), in order to compute the
intent part. ClosureCompute focuses on quadri-sets of
QS having different intent parts (Algorithm 3, Line
10). The fourth and final step of QuadriCons invokes
the ClosureCompute procedure one last time, with an
indicator equal to 3. This makes it possible to focus on
quadri-sets having different variable parts before
generating quadri-concepts. QuadriCons comes to an end
after this step and returns the set of frequent
quadri-concepts that fulfill the four thresholds minsuppu,
minsuppt, minsuppr and minsuppd. The QuadriCons algorithm
invokes the PruneInfrequentSets function (Lines 8,
13 and 18) in order to prune infrequent quadri-sets/concepts,
i.e., those whose modus/intent/variable cardinality does not
fulfill the aforementioned thresholds.
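
As a rough illustration of the modus computation and of the pruning, the following Python sketch covers only the first case of Algorithm 3 (Line 3) followed by a cardinality filter. It is not the authors' code: the helper names and the dictionary layout are assumptions made here, the equality test on the variable part is an extra assumption of this sketch (Line 3 only compares intents and extents), and the second case (Line 6, creation of new quadri-generators from incomparable ones) is left out.

```python
def quadri(extent, modus, intent, variable):
    """Build a quadri-set as a dict of frozensets (hypothetical layout)."""
    return {
        "extent": frozenset(extent),
        "modus": frozenset(modus),
        "intent": frozenset(intent),
        "variable": frozenset(variable),
    }


def merge_modus(q, candidates):
    """First case of the modus step: collect the tags of every candidate
    with the same intent (and, by assumption, the same variable) whose
    extent contains the extent of q."""
    modus = set(q["modus"])
    for other in candidates:
        if (other["intent"] == q["intent"]
                and other["variable"] == q["variable"]  # assumption of this sketch
                and q["extent"] <= other["extent"]):    # "<=" is the subset test
            modus |= other["modus"]
    return {**q, "modus": frozenset(modus)}


def prune_infrequent(quadri_sets, part, minsupp):
    """Keep the quadri-sets whose given part has at least minsupp elements."""
    return [s for s in quadri_sets if len(s[part]) >= minsupp]


# Two toy quadri-generators (illustrative values).
a = quadri({"u1", "u2", "u4"}, {"t1"}, {"r1"}, {"d1"})
b = quadri({"u1", "u2", "u3", "u4"}, {"t2"}, {"r1"}, {"d1"})

merged = [merge_modus(q, [a, b]) for q in (a, b)]
kept = prune_infrequent(merged, "modus", minsupp=2)
for s in kept:
    print(sorted(s["extent"]), sorted(s["modus"]))
```

Here the extent of the first generator is contained in that of the second, so its modus grows to two tags and it survives the pruning with minsuppt = 2, whereas the second generator keeps a single tag and is discarded.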

C. Structural properties of QuadriCons 

Proposition 2: The QuadriCons algorithm is correct
and complete. It accurately retrieves all the frequent
quadri-concepts.
Proof: 

■ 

Proposition 3: The QuadriCons algorithm terminates. 
Proof: 

■ 

Theoretical Complexity issues: 



ALGORITHM 2 : FindMinimalGenerators

Data :
  1) MG : The set of frequent quadri-generators.
  2) Fd (U, T, R, D, Y) : A d-folksonomy.
  3) minsuppu : User-defined threshold of user's support.
Results : MG : {The set of frequent quadri-generators}.

1  Begin
2    Foreach triple (t, r, d) of Fd do
3      Us = {ui ∈ U | (ui, t, r, d) ∈ Y};
4      If |Us| ≥ minsuppu then
5        g.extent = Us; g.intent = r; g.modus = t; g.variable = d
6        If g ∉ MG then
7          AddQuadri(MG, g)
8        End
9      End
10   end
11 End
12 return MG;



D. Illustrative example 

Consider the d-folksonomy depicted by Table I, with
minsuppu = 2, minsuppt = 2, minsuppr = 1 and
minsuppd = 1. Figure 2 sketches the execution trace of
QuadriCons on this context. As described above,
QuadriCons operates in four steps:

Step 1 The first step of QuadriCons involves the extraction
       of quadri-generators (QGs) from the context.
       QGs are maximal sets of users following a triple
       of tag, resource and date. Thus, the eleven QGs
       that fulfill the minimum threshold minsuppu are
       described by Figure 2 (Step 1).

Step 2 Next, QuadriCons invokes the ClosureCompute
       procedure a first time on the quadri-generators,
       allowing the computation of the modus part of
       such candidates. For example, since the extent
       part of {{u1, u2, u4}, t1, r1, d1} is included
       in that of {{u1, u2, u3, u4}, t2, r1, d1}, the
       modus part of the first QG will be equal to {t1,
       t2}. Moreover, new QGs can be created from the
       intersection of the first ones (see Algorithm 3, Line
       xx): this is the case of the two QGs (a) and (b) (cf.
       Figure 2, Step 2). Finally, candidates that do not
       fulfill the minimum threshold minsuppt are pruned
       (cf. the last three).

Step 3 Then, QuadriCons proceeds to the computation
       of the intent part of each candidate through a
       second call to the ClosureCompute procedure.
       For example, the candidate {{u1, u2, u4}, {t1, t2},
       r1, d1} has an extent, modus and variable included
       in or equal to those of the candidate {{u1, u2, u4},
       {t1, t2}, r2, d1}. Then, its intent will be equal
       to {r1, r2}. At this step, four candidates fulfill the
       minimum thresholds over the intent part (Figure
       2, Step 3). By merging comparable candidates, this
       step also reduces their number.

Step 4 Via a last call to the ClosureCompute procedure,
       QuadriCons computes the variable part
       of each candidate while pruning infrequent ones.
       Since the candidate {{u1, u2}, {t1, t2}, r1, d2} has
       an extent, modus and intent included in those of
       {{u1, u2, u4}, {t1, t2}, {r1, r2}, d1}¹, its variable
       will be equal to {d1, d2}.

After Step 4, QuadriCons terminates. The four
frequent quadri-concepts given as output are:

1) {{u1, u2, u4}, {t1, t2}, {r1, r2}, d1}

2) {{u1, u3, u4}, {t2, t3}, {r1, r2}, d1}

3) {{u1, u4}, {t1, t2, t3}, {r1, r2}, d1}

4) {{u1, u2}, {t1, t2}, r1, {d1, d2}}
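
As a quick sanity check, the four quadri-concepts listed above can be tested against the thresholds of this example (minsuppu = 2, minsuppt = 2, minsuppr = 1, minsuppd = 1). The Python sketch below is illustrative only; the dictionary layout and the function name are assumptions made here.

```python
# Minimum cardinality required for each part of a quadri-concept.
THRESHOLDS = {"extent": 2, "modus": 2, "intent": 1, "variable": 1}

# The four output quadri-concepts of the illustrative example.
concepts = [
    {"extent": {"u1", "u2", "u4"}, "modus": {"t1", "t2"},
     "intent": {"r1", "r2"}, "variable": {"d1"}},
    {"extent": {"u1", "u3", "u4"}, "modus": {"t2", "t3"},
     "intent": {"r1", "r2"}, "variable": {"d1"}},
    {"extent": {"u1", "u4"}, "modus": {"t1", "t2", "t3"},
     "intent": {"r1", "r2"}, "variable": {"d1"}},
    {"extent": {"u1", "u2"}, "modus": {"t1", "t2"},
     "intent": {"r1"}, "variable": {"d1", "d2"}},
]


def is_frequent(concept, thresholds):
    """True if every part of the concept reaches its minimum cardinality."""
    return all(len(concept[part]) >= minimum
               for part, minimum in thresholds.items())


for number, concept in enumerate(concepts, start=1):
    print(number, is_frequent(concept, THRESHOLDS))
```

Each of the four concepts reaches all four minimum cardinalities, which is consistent with their being reported as frequent.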

VI. Evaluation and Discussion 

In this section, we assess, through extensive experiments,
the performances of QuadriCons vs. Data-Peeler. We also
focus on the differences between the memory consumed by
both algorithms. Moreover, we compare the number of frequent
quadri-concepts versus the number of frequent quadri-sets in
order to assess the compactness of the extracted representation.
We have carried out our experiments on two real-world datasets
described in the following. Statistics about these snapshots
are summarized in Table II.

MovieLens (http://movielens.org) is a movie recommendation
       website. Users are asked to rate
       movies they like and dislike. The MovieLens
       dataset used for our experiments is freely
       downloadable [8].

Last.fm (http://last.fm) is a music website, founded
       in 2002. It claimed 30 million active users in
       March 2009. The Last.FM dataset used for our
       experiments is freely downloadable [8].

Table III shows two examples of frequent quadri-concepts 
extracted from the MovieLens and Last.fm datasets. The 

¹ Concretely, it means that the users u1 and u2, who shared the resource
r1 with the tags t1 and t2 at the date d2, also shared it at the date d1.





                        Dataset 1         Dataset 2
                        (MovieLens)       (Last.FM)
Type                    Dense             Sparse
# Quadruples            95580             186479
# Users                 4010              1892
# Tags                  15227             9749
# Resources             11272 (movies)    12523 (artists)
# Dates (timestamps)    81601             3549

Table II

Characteristics of the considered snapshots.



Datasets    Dates        Users      Tags       Resources
MovieLens   03/12/2005   krycek     kids       Harry Potter
            16/07/2006   maria      fantasy    The Prisoner of Azkaban
            21/02/2008              darkness   The Order of the Phoenix
                                    magic
Last.fm     07/05/2010   csmdavis   pop        Britney Spears
            02/06/2011   franny     concert    Madonna
                         rossanna   dance

Table III

Examples of frequent quadri-concepts of MovieLens and
Last.fm.



first one depicts that the users krycek and maria used the
tags kids, fantasy, darkness and magic to annotate the movie
Harry Potter and its sequels successively on 03/12/2005, on
16/07/2006 and then on 21/02/2008. Such a concept may be
exploited further for recommending tags for that movie or
for analyzing the evolution of the tags associated with
"Harry Potter". The second quadri-concept shows that the
users csmdavis, franny and rossanna shared the tags pop,
concert and dance to describe the artists Britney Spears and
Madonna at two different dates. We can use such a
quadri-concept to recommend the users franny and rossanna
to the first one, i.e., csmdavis, as they share the same
interest in both artists using the same tags.
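
The user recommendation suggested above can be sketched in a few lines of Python. This is only one possible reading of the idea, under the assumption that every other member of the extent of a quadri-concept is recommended to the target user; the function name is hypothetical.

```python
def recommend_users(extent, target):
    """Recommend to `target` the other users of a quadri-concept extent.

    Returns an empty list when `target` does not belong to the extent,
    since the concept then says nothing about that user.
    """
    if target not in extent:
        return []
    return sorted(set(extent) - {target})


# Extent of the Last.fm quadri-concept of Table III.
lastfm_extent = {"csmdavis", "franny", "rossanna"}
print(recommend_users(lastfm_extent, "csmdavis"))
```

For csmdavis, the sketch returns franny and rossanna, as in the discussion above.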

A. Execution Time 

B. Consumed Memory 

C. Compactness of Quadri-Concepts

VII. Conclusion and Perspectives 

# Quadruples | Minimum Thresholds | QuadriCons | Data-Peeler | Minimum Thresholds | QuadriCons | Data-Peeler | Minimum Thresholds | QuadriCons | Data-Peeler

Table IV

Performances of QuadriCons vs. Data-Peeler on the MovieLens and Last.fm datasets.



